We generally employ vector-matrix notation. 
Transpose notation is generally omitted where the meaning 
is clear from the context. However, it is used in the 
linear-quadratic problems since the results are familiar 
in this form. A vector is Positive, Negative or Zero 
if all its components are so respectively. Similarly, 
a vector is Less Than (is Below, in a vector space 
representation) another if each of its components is 
less than (below) the corresponding component of the 
other. This applies to other inequalities as well. 
Variable subscripts indicate partial derivatives. 

The Notation used in the Thesis is indicated in the text. 
However we append here most of the symbols used. 
h observation function 

H Hamiltonian 
cooperative Value function 
terminal cost function 
order 
(6) SETS, REGIONS, SURFACES AND CURVES 


JV 
Double Integral Plant 

R regions in the cooperative solution of the 
Double Integral Plant 

^ cost orthant defined in the Minimum Principle 
for the Pareto optimal solution 
x-) 
occurs in the Regular Decomposition of the 
Playing Space 


E set of Equilibrium Points 

M set of Minimax Points 

r,V,A switching curves 
(7) SUPERSCRIPTS AND SUBSCRIPTS 


_ underbar used when similar quantities of all 
the players are put together 

o superscript used to refer to Pareto optimal or 
Nash cooperative solutions 

o subscript refers to the initial time 

f subscript to indicate final time 

* superscript to refer to equilibrium solutions 

p superscript to denote a general player 

T superscript to denote transpose 
(8) SPECIAL NOTATION 

over which it is taken 

Π multiplication symbol with the multiplication index 

Σ summation symbol and summation index 

∂ denotes differentiation 

‖·‖_Q norm with respect to the matrix Q 

C^(k) class of functions which have continuous 
derivatives up to order k 

∈ belongs to 



SYNOPSIS 


UPADRASTA RAVIKIRANA PRASAD, Ph.D. 

Department of Electrical Engineering 
Indian Institute of Technology, Kanpur 
December 1969 

N-PERSON DIFFERENTIAL GAMES AND 
MULTICRITERION OPTIMAL CONTROL PROBLEMS 

Deterministic two-person zero-sum Differential 
Games were first studied by Isaacs, mainly in connection 
with Pursuit-Evasion problems. Berkovitz gave a rigorous 
mathematical foundation to this subject. Recently, 
N-Person Differential Games have started to receive 
attention. In N-Person Games, the payoff function 
vector of the players orders their joint strategies 
only partially, and hence the solution of these games 
requires the invoking of Supercriteria. In this sense, 
N-Person Games are related to problems of Vector 
Programming and Decision-Making under Uncertainty. 
Extension of these interrelationships to a multistage 
setting under dynamic constraints, to study N-Person 
Differential Games and Multicriterion Optimal Control 
Problems, provides the motivation for this Thesis. 

N-Person Differential Games admit various 
Information Patterns and levels of Cooperation among 
the players. The solution concepts of Finite Games are 
generally applicable to these games. Thus the Noncooperative 
Solution is given in terms of Equilibrium and Minimax Points, 
and the Cooperative Solution is given by Pareto Optimal 
Points. The application of these concepts in the case of 
deterministic Differential Games is discussed in Chapter II. 

In Chapter III, Necessary Conditions, similar to 
the Minimum Principle of Pontryagin, are derived for the 
Noncooperative Solution in terms of Equilibrium Points. It 
is established that the optimal strategies of the players 
necessarily induce Equilibrium Points in the Hamiltonians, 
one for each player, with the usually associated 
Euler-Lagrange equations and the Transversality and Corner 
conditions. 

A general N-Person Differential Game exhibits a 
variety of switching surfaces similar to those encountered 
in two-person zero-sum games. These can be broadly 
classified into Transition, Singular, Dispersal and 
Abnormal Surfaces. While the Transition Surfaces are 
obtained by the application of corner conditions, the others 
require further conditions for their determination. For 
example, the Singular Surfaces are constructed by the 
application of the Legendre-Clebsch condition in its 
generalized form. Chapter IV is devoted to a discussion 
of these surfaces and their construction. 



The concept of Pareto Optimality is discussed in 
detail in Chapter V along with the Cooperative Solutions 
of Differential Games. The necessary conditions for Pareto 
Optimality show that Pareto Optimal Solutions are obtained 
by solving a class of parametrized optimal control problems 
with the criterion functionals given as convex combinations 
of the payoff functionals of all the players. The various 
cooperative solutions differ in the underlying Supercriteria 
used to single out one Pareto optimal point as the solution. 
It is shown that Multicriterion Optimal Control Problems can 
be solved as Cooperative N-Person Differential Games without 
sidepayments and with equal information to all the players. 
A computational method is suggested for the resulting Nash 
Cooperative Solution. The solution of the double integral 
plant with the twin objectives of minimizing time and fuel 
is given as a running example of a two-person nonzero-sum 
game throughout the Thesis. 

The study of Imperfect and Incomplete Information 
Differential Games is initiated by considering Multicriterion 
Optimal Control under Uncertainty in Chapter VI. Because of 
their equal information feature, these problems are easier 
to solve than the corresponding Differential Games. A 
thorough study of such problems requires additional 
mathematical concepts from Stochastic Optimal Control and 
Markov Processes and is suggested for further investigation. 



CHAPTER I 


INTRODUCTION 

1.1 GENERAL 

It has long been recognized that the Theory of 
Games, which initially arose as a mathematical reduction 
of Competitive Economic Behaviour (von Neumann and 
Morgenstern 1944), has potential application in many-sided 
complex decision-making problems of great significance. 
While a Mathematical Programming Problem can be viewed as 
a simple decision-making problem, Statistical Decision 
Theory deals with the more complex problem of optimization 
under uncertainty. Two-person zero-sum game theory is 
applied for the solution of this problem, with the 
Uncertainty playing the role of an antagonist to the 
Decision-maker (Blackwell and Girshick 1954). 

The Theory of Control Processes can be considered 
as dealing with multistage decision-making problems under 
dynamic constraints, and the Dynamic Programming of 
Bellman (1957) is essentially developed for such problems. 
Pontryagin (1962) gave necessary conditions for the 
deterministic optimal control problem, which in turn can be 
considered a generalization of the calculus of variations 
(Hestenes 1966). The interrelationships between these 
topics made Control Theory one of the major disciplines 
of Applied Mathematics in recent times. 
As uncertainty enters into the formulation of a 
control problem in terms of ignorance of the plant dynamics 
and uncontrollable inputs to the plant (Horowitz 1963), the 
problem of control under uncertainty has received the 
attention of many investigators under both deterministic and 
stochastic formulations. In either case, the problem has 
been cast as a two-person zero-sum game only recently (Ragade 
and Sarma 1967 and Sworder 1966). 

In any practical control system, it is inherent 
that various basically different requirements are to be met 
in the design. The present Optimal Control Theory is 
restrictive in the assumption that these objectives can be 
reduced to a single criterion for minimization. Zadeh (1963) 
suggested the use of a vector criterion, with its components 
representing the various requirements, to judge the 
performance of a system. The resulting Multicriterion 
Optimal Control Problem (see Chang 1966) is a multistage 
generalization of the Vector Programming Problem (Kuhn and 
Tucker 1951 and Da Cunha and Polak 1967) under dynamic 
constraints. 

The seemingly different problems sketched above 
have basic similarities. In this chapter, after giving 
a brief review of N-person games, we introduce the 
Multicriterion Optimal Control Problems. We then study 
the interrelationships between the problems of games, 
decision-making under uncertainty and vector programming. 

The objective of this Thesis is to pursue these 
aspects further to a multistage setting and to study N-person 
Differential Games and Multicriterion Optimal Control 
Problems under basically similar frameworks. 

1.2 BRIEF REVIEW OF N-PERSON GAMES 

The foundations of the mathematical theory of 
Games of Strategy were laid by von Neumann. Along with 
Morgenstern (1944), he emphasized a new approach to 
Competitive Economic Behaviour through a mathematical 
reduction to suitable Games of Strategy with N 
participants in general. 

N-person games permit different information 
patterns to the players and various levels of cooperation 
among them (Luce and Raiffa 1957). However, the study of 
these games, as initiated by von Neumann and pursued by 
others later on, is largely in the Normal Form, which does 
not permit explicit information patterns to the players, 
and the game loses its multistage character in this form. 

The main contribution in Game Theory is the 
min-max theorem for the two-person zero-sum game. N-person 
games with cooperation and sidepayments permitted between 
the players are studied in terms of the Characteristic 
Function Theory (von Neumann and Morgenstern 1944 and 
Luce and Raiffa 1957). While the solution of the 
two-person zero-sum game in terms of saddle points is 
appealing, the Characteristic Function Theory is 
unsatisfactory in many respects (Luce and Raiffa 1957). 
The Solutions may be numerous and 'embarrassingly rich' on 
the one hand, while on the other a recent result by 
Lucas (1967) shows that the Solution may not even exist 
for some classes of games. Alternative approaches to the 
concept of Solution are the Value of Shapley, the Reasonable 
Outcomes of Milnor and ψ-Stability (Luce and Raiffa 1957). 

A further approach to N-person games is due to 
Nash (1951), who defines the solution of a game with no 
cooperation between the players in terms of its Equilibrium 
Points. By arguing that the noncooperative model is more 
basic, since any cooperation between the players can be 
represented as moves in a larger game, he obtains the 
cooperative solution of a two-person game without 
sidepayments as the noncooperative solution of the overall 
game (Nash 1953). The ideas and results of Nash are 
fruitfully extended by Harsanyi (1956, 1959 and 1964) to 
cover larger classes of games, connecting them up with the 
earlier theories of Bargaining in Econometrics. A 
different approach to the N-person cooperative game 
without sidepayments is due to Aumann and Peleg (1960), 
based on a Generalized Characteristic Function. 

Further extensions of the above results have 
been numerous, as can be seen in (Kuhn and Tucker 1950 
and 1953, Dresher et al. 1957, Tucker and Luce 1959 and 
Dresher et al. 1964) and the references compiled in them. 
Infinite games, in which the players are permitted an 
infinite number of strategies, of which the games of 
timing and the games on the unit square are examples 
(see Karlin 1959), are studied by many authors. 

Kuhn (1953) and Dalkey (1953) introduced the 
study of games in Extensive Form which can explicitly 
take care of the imperfectness of the information to the 
players. A player’s information is perfect if and only 
if he possesses at every move the information regarding 
the history of his previous moves and those of the others. 

A particular class of the infinite games with a 
continuum of strategies and a continuum of moves are the 
Differential Games, the two-person zero-sum form of which 
has been studied extensively by Isaacs (1965), Berkovitz 
(1964 and 1967) and others (see Ho et al. 1969 for an 
excellent compilation of references in the area of 
Differential Games). Since the formulation of Differential 
Games follows the same lines as the optimal control 
problems, except for two or more controlling agencies or 
players, they can be considered as a generalization of the 
optimal control problems, and they retain their multistage 
character. Pontryagin (1966) and others (Patsyukov 1968) 
formulated a class of Differential Games with a hierarchical 
form of information to the players. Recently there has been 
interest in Differential Games with N players (Starr and 
Ho 1969, Prasad and Sarma 1969 and Case 1969). 

While imperfect information games are studied in 
the literature (Kuhn 1953 and Dalkey 1953) with extensions 
to Differential Games (Behn and Ho 1968, Rhodes and 
Luenberger 1969, Rhodes 1969 and Sarma et al. 1969), 
incomplete information games have not received much 
attention except for the recent contributions by Harsanyi 
(1967 and 1968) and Ragade (1968). A player has complete 
information if and only if he has complete knowledge of the 
strategic capabilities and objectives of himself and the 
other players in the game. 

Under a very general framework of a positional 
game, Ragade (1968) establishes the relation between the 
various game models discussed earlier and the multistage 
games such as Stochastic Games, Recursive Games etc. It 
is also interesting to note that there have been attempts 
to consider a game with an infinite number of players 
(Kalisch and Nering 1959 and Kannai 1964). 
1.3 INTRODUCTION TO MULTICRITERION OPTIMAL CONTROL 

It is well known that the state-space method of 
controller design has the advantage of a more realistic 
mathematical formulation of the overall control problem, 
involving the essential dynamics and constraints and the 
tasks the system is supposed to perform. Thus the system 
state x satisfies a vector differential equation 

$\dot{x} = f(x, u, t)$ (1.1) 

where the control input u is chosen from the restraint 
set $\Omega$. The system state is transferred from the initial 
condition $x(t_0) = x_0$ to a terminal surface specified by 

$\psi(x_f, t_f) = 0$ (1.2) 

To choose from the variety of control inputs 
which achieve the task satisfying the specified constraints, 
a performance index or a criterion functional, depending 
upon the requirements or the objectives to be met by the 
system, is minimized with respect to the control policies. 
The criterion functional is given by 

$J^p[x_0, t_0, u] = \theta^p(x_f, t_f) + \int_{t_0}^{t_f} L^p(x, u, t)\, dt$ (1.3) 

If $J^p$ is a scalar, it introduces a total ordering on the 
control policies, and the optimal policy, which minimizes 
$J^p$, is chosen. Of course there can be associated problems 
of existence and uniqueness of solutions (Lee and Markus 
1967). 

However, there are usually requirements, basically 
different in nature, that enter into the judgement of the 
system performance, and the choice of a scalar functional 
is necessarily arbitrary and subjective. One way to 
overcome this difficulty is to propose multiple criteria, i.e., 
with p = 1, ... N in (1.3). The vector $\underline{J} = (J^1, \ldots, J^N)$ 
is called the criterion functional vector. Now a system 
may be better than another with respect to some components 
of $\underline{J}$ and worse with respect to others. In other words, 
the vector criterion cannot induce a natural (or total) 
ordering on the set of allowable control policies. We 
describe some of the earlier methods of designing a system 
given multiple criteria. 
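
As a small numerical illustration of such a partial ordering, the sketch below evaluates a two-component criterion (transfer time and fuel) for the double integral plant. The bang-off-bang control family, the initial state (1, 0), the unit bound on the control and the name `bang_off_bang_costs` are assumptions made only for this illustration; they are not taken from the text.

```python
def bang_off_bang_costs(a):
    """Return (time, fuel) for one hypothetical control of the plant x'' = u.

    Assumed control: u = -1 for a seconds, u = 0 while coasting,
    u = +1 for a seconds, steering the state from (x, v) = (1, 0)
    to the origin. Requires 0 < a <= 1.
    """
    if not 0.0 < a <= 1.0:
        raise ValueError("a must lie in (0, 1]")
    coast = (1.0 - a * a) / a  # coasting time that makes the arcs meet at the origin
    x, v = 1.0, 0.0
    # Propagate the state exactly through the three constant-control arcs.
    for u, duration in ((-1.0, a), (0.0, coast), (1.0, a)):
        x += v * duration + 0.5 * u * duration * duration
        v += u * duration
    if abs(x) > 1e-9 or abs(v) > 1e-9:
        raise RuntimeError("control does not reach the origin")
    time_cost = 2.0 * a + coast  # J^1: total transfer time
    fuel_cost = 2.0 * a          # J^2: integral of |u| dt
    return time_cost, fuel_cost


fast = bang_off_bang_costs(1.0)    # time 2.0, fuel 2.0
frugal = bang_off_bang_costs(0.5)  # time 2.5, fuel 1.0
print(fast, frugal)
```

Under these assumptions the first control is better in time and the second is better in fuel, so neither can be ranked above the other by the vector criterion alone.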

Nelson (1964) specifies some acceptable bounds, 
similar to the classical design techniques, on all but one 
criterion. The resulting problem is an optimal control 
problem with respect to the free criterion, with several 
isoperimetric constraints. A second method, due to 
Chyung (1967), orders all the criteria according to their 
importance, and the optimization is performed in a 
hierarchy. In other words, each criterion is applied on 
the optimal controls resulting from the application of 
the preceding criterion. Lastly, the vector criterion 
is reduced to a single index by attaching suitable 
positive weights to the various components of the vector. 
Many problems may not permit the assumptions in these 
methods. 


Now, suppose there is a control law such that 
the performance of the system cannot be improved with 
respect to any component of the vector criterion without 
simultaneously deteriorating the performance with respect 
to at least one of the other components. Such a control law 
obviously has a Weak Optimality Characteristic. Such control 
laws are called Noninferior and in general are nonunique. 
Without imposing Supercriteria, there does not seem to be a 
way of choosing a particular noninferior policy as the 
solution of the problem. The earlier methods discussed in 
the preceding paragraph can be considered as the application 
of such Supercriteria. 
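
The notion of noninferiority can be made concrete for a finite set of candidate policies. The following is a minimal sketch, assuming each policy is summarized by a tuple of costs to be minimized; the function names and the sample cost vectors are hypothetical and serve only as an illustration.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b in every component
    and strictly better in at least one (all costs are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def noninferior(costs):
    """Return the cost vectors that no other vector in the list dominates."""
    return [c for c in costs if not any(dominates(d, c) for d in costs)]


# Hypothetical (time, fuel) cost vectors of four candidate policies.
candidates = [(2.0, 2.0), (2.5, 1.0), (3.0, 1.5), (4.0, 0.8)]
print(noninferior(candidates))
```

In this example only (3.0, 1.5) is discarded, since (2.5, 1.0) is better in both components. Three noninferior vectors remain, and a Supercriterion is still needed to choose among them.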

If uncertainty arises in a control problem in 
whatever form (Horowitz 1963), it can be characterized, to 
a large extent, in a statistical sense. If such a 
characterization is complete and the performance index is 
scalar-valued, one can find a control policy which is best 
in the Statistical Expectation Sense. In the case when the 
characterization of the uncertainty is incomplete, the 
performance index cannot induce a total ordering on the 
set of control policies, even if the performance index is 
scalar-valued. Once again, a solution to this problem is 
obtained only through the application of a Supercriterion 
(Sworder 1966). In the next section, we study the 
interrelationships between the problems of Vector Programming, 
Games and Decision-making under Uncertainty and the 
application of Supercriteria to solve them. 

1.4 VECTOR PROGRAMMING, GAMES AND UNCERTAINTY 

The problem of vector minimization arose in many 
branches of Science and Mathematics. The concept of 
Noninferiority discussed in the earlier section is related 
to the concepts of Admissibility in Statistics (Wald 1950), 
Efficient Solutions in Economics (Karlin 1959), Pareto 
Optimality in Game Theory (Luce and Raiffa 1957) and 
Minimal Solution in Vector Mathematical Programming (Kuhn 
and Tucker 1951). All these signify Weakly Optimal 
Solutions arising as minimal solutions with respect to a 
partial ordering introduced by the vector function on the 
set of allowable control or decision variables. Since in 
general such solutions are nonunique, this may be referred 
to as the Dilemma introduced by the partial ordering. To 
resolve this Dilemma, Supercriteria have to be applied to 
single out the Solution to the Problem from the variety 
of Noninferior Solutions. 
In the case of a cooperative game, the solution, 
which is noninferior, should reflect the strategic 
potentialities of the players. For this, the noncooperative 
solution of the game is studied extensively, covering such 
aspects as the powers of the players in forming coalitions, 
issuing threats to the other players, etc. (Luce and Raiffa 
1957). Thus the Supercriterion used in obtaining the 
cooperative solution depends upon the noncooperative 
solution. This procedure is made possible because each 
player's decision variables in a game are known. 

In a general vector minimization problem, such a 
procedure of obtaining the Supercriterion through the 
noncooperative solution is not possible straight away. 
However, if the various decision variables are allocated 
to the different components of the vector function, the 
method applies. This is equivalent to writing, for example 
in (1.1) and (1.3), 

$u = (u^1, \ldots, u^N)$ (1.4) 

The solution obtained by such a procedure would have the 
advantage of reflecting the tradeoff factors between the 
various criteria in a game-theoretic sense for the 
considered allocation. In many problems, the allocation 
can be done in a natural manner and this can be exploited 
fruitfully. 
In a two-person zero-sum game, no cooperation 
between the two players is possible. In a Decision-making 
Problem under Uncertainty, the dilemma introduced by the 
uncertainty can be resolved by considering uncertainty as 
an intelligent antagonist. Such a solution would be 
optimal under the worst case and is the basis of all 
worst-case designs. Once again the procedure can be 
thought of as the application of a Supercriterion and is 
borrowed from Game Theory. 


The problem of vector minimization under 
uncertainty can similarly be reduced to a two-person 
zero-sum game with a vector payoff function. The concepts 
of Approachability and Excludability by Blackwell (1956) 
are Weakly Optimal and the solution should be contained 
in sets having these properties. 

1.5 OUTLINE OF THE THESIS 


We give, in the next chapter, a general framework 
to study N-Person Differential Games and discuss the 
applicability of the concepts of Finite Games to 
Differential Games. 


In Chapter III, Necessary and some Sufficient 
Conditions are presented for the Noncooperative Solution 
of the game, along with simple illustrative examples. This 
is followed, in Chapter IV, by the general construction 
of various switching surfaces encountered in these games. 
An example of a Double Integral Plant is worked out in 
detail to obtain the noncooperative solution. Many 
phenomena, both encountered in Finite Games and otherwise, 
are shown. 

In Chapters V and VI, we extend the 
interrelationships studied in Section 1.4 to the multistage 
case and consider Multicriterion Optimal Control Problems 
with and without Uncertainty. It is shown that these 
problems can be solved as N-Person Cooperative Differential 
Games without Sidepayments and with equal information to all 
the players. In Chapter V, we study the deterministic 
problems and, in Chapter VI, problems under uncertainty along 
with some of the current problems of interest in Optimal 
Control Theory. Chapter VII concludes the Thesis. 



CHAPTER II 


BASIC STRUCTURE OF N-PERSON DIFFERENTIAL GAMES 

2.1 INTRODUCTION 

In this chapter we develop a general framework to 
study N-person differential games. First we review the 
solution concepts of N-person games, some of which are 
already mentioned in Chapter I. Most of these concepts are 
associated with the Normal Form of the game and depend 
mainly on whether the game is played under noncooperation or 
cooperation among the players. Though the Noncooperative 
Solution can be considered in its own right, it is also 
necessary in some form for the Cooperative Solution. We 
follow the references (Nash 1951 and Harsanyi 1964) for the 
Noncooperative Solution and (von Neumann and Morgenstern 
1944, Nash 1953, Harsanyi 1959 and Luce and Raiffa 1957) 
for the Cooperative Solution. 

A fairly general class of Deterministic N-Person 
Differential Games is formulated and extensions to other 
classes are indicated in Section 2.3. This is followed up 
with some basic solution concepts in these games. In 
essence, we show the adaptability of the solution concepts 
of finite games to the present situation and discuss the 
implications, bearing in mind the dynamics of the problem. 
2.2 CONCEPTS FROM THE THEORY OF N-PERSON GAMES 

An N-person game arises out of a situation 
involving a Conflict of Interest among a set of N players, 
and Game Theory deals with the Decision-Making Problems of 
each of the players in this situation. The structure of 
the conflict, consisting of the Rules of the Game, is best 
represented in a model known as the game in its Extensive 
Form. 

Games in Extensive Form: 

A game in its extensive form consists of a 
Topological Tree (see Figure 2.1), with a distinguished 
vertex (the hollow node 0 in the figure) representing 
the first move, and a Payoff Function defined for each 
player on the terminal vertices (the hollow nodes $f_1$, 
$f_2$, ...) representing the Outcomes of the game. Further, 
the nonterminal vertices are partitioned into (N+1) sets 
corresponding to the N players and Nature¹, called the 
Player Sets (labelled with 1, 2, 3, 4 for the players and 
0 for Nature²). Each player set is further subdivided 
into Information Sets for the respective player (shown by 
broken lines in the figure), the nodes in which are 
Indistinguishable in terms of the Previous Transitions and 
Current Choices available to him. The nodes represent 
Moves in general, while the branches at a node corresponding 
to a player (say a, b and c at node 0 of Nature) 
represent the Alternatives available to him and the 
Transitions resulting from each Choice of the Alternatives 
(Indistinguishable Choices are represented by the same 
letter in the figure, for example d d', e e' and g g' 
for Player 1). A probability distribution governs the 
transitions at a Nature's Move. 

¹ In a Completely Deterministic Game, there will be no 
player set corresponding to Nature. 

² The first move may belong to any player in general. 
In Figure 2.1 it is Nature's move. 

Information Patterns to Players: 

A player is said to have Perfect Information if 
all of his Information Sets contain only one node 
(Players 2 and 3 have Perfect Information in Figure 2.1). 
Otherwise his information is Imperfect (for example 
Players 1 and 4 in the figure). While imperfect 
information is effectively portrayed by the extensive form, 
there is another concept, known as Complete Information, 
which is not represented in the Extensive Model. Under this 
assumption, a player has Complete Information if he has full 
knowledge of the game in its Extensive Form, and a Complete 
Information Game implies Complete Information to all its 
players. 

The objective of each player is to minimize³ his 
payoff function, and this assumption goes under the name of 
Rationality in game theory. The other assumptions, made 
outside the Extensive Model, are the constraints regarding 
the Communication and Cooperation between the players. 

³ To conform with control theoretic terminology. 

Normal Form of a Game: 

A Choice Function mapping any player's information 
sets into his corresponding alternatives at the moves in 
the information set is called a Pure Strategy of the player. 
Given the game in its extensive form, one can enumerate all 
the pure strategies for the various players. A game in its 
Normal Form is given by the pure strategies and the payoff 
functions of the players as shown below: 

$\Gamma = \{\mathcal{U}^1, \ldots, \mathcal{U}^N;\ J^1, \ldots, J^N\}$ (2.1) 

where $\mathcal{U}^1, \ldots, \mathcal{U}^N$ are the players' strategy sets and 
$J^1, \ldots, J^N$ are the payoff functions. Since a selection of 
one strategy each by the players determines an outcome, 
$J^1, \ldots, J^N$ can be considered as scalar real-valued functions 
on the product of strategy sets. That is, for p = 1, ... N, 

$J^p : \mathcal{U}^1 \times \cdots \times \mathcal{U}^N \to \mathbb{R}$ (2.2) 

The game obtained by replacing each $\mathcal{U}^p$ by the set of all 
probability distributions on $\mathcal{U}^p$ is called the Mixed 
Extension of $\Gamma$, and each of the probability distributions is 
called a Mixed Strategy. Against this, a different type of 
random strategy, called a Behaviour Strategy, is defined 
by specifying a probability distribution at every 
information set over the corresponding alternatives. 
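
For a finite game, the payoff in the mixed extension is the expectation of the pure-strategy payoff under the product of the players' distributions. The sketch below assumes that pure strategies are indexed by integers and that the payoff is stored as a dictionary keyed by strategy profiles; this representation, the function name and the matching-pennies style example are illustrative assumptions, not part of the text.

```python
from itertools import product


def expected_cost(cost, mixed):
    """Expected cost when the players randomize independently.

    cost  : dict mapping a pure-strategy profile (tuple of indices) to a cost
    mixed : list of probability vectors, one per player
    """
    total = 0.0
    for profile in product(*(range(len(m)) for m in mixed)):
        prob = 1.0
        for player, strategy in enumerate(profile):
            prob *= mixed[player][strategy]
        total += prob * cost[profile]
    return total


# Hypothetical cost to player 1 in a two-strategy, two-player game.
cost_1 = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}
print(expected_cost(cost_1, [[0.5, 0.5], [0.5, 0.5]]))
print(expected_cost(cost_1, [[1.0, 0.0], [0.0, 1.0]]))
```

A pure strategy appears here as a degenerate distribution, so the second call simply recovers the pure-strategy cost of the profile (0, 1).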

In its normal form, the game loses its 
multi-move character and the imperfectness of information to the 
various players. Full knowledge of all the sets $\mathcal{U}^1, \ldots, \mathcal{U}^N$ 
and the functions $J^1, \ldots, J^N$ by a player is regarded as 
complete information to him in this model. Most of the 
solution concepts are studied for Complete Information 
Games in Normal Form. 

The Solution to a Game consists in finding 
Optimal Strategies for each player to minimize his payoff 
function, taking into account his and the other players' 
information patterns and payoff functions and other 
constraints on communication and cooperation between the 
players. The various solution concepts are basically 
different and incomplete. This defect in game theory seems 
to be due to the lack of a game model which includes the 
constraints on communication and cooperation between the 
players. 

Noncooperative Solution of a Game⁴: 

The noncooperative solution of a game in normal 
form is given in terms of its Nash Equilibrium and Minimax 
Strategies (Nash 1951, Harsanyi 1964 and Luce and Raiffa 
1957). The game in (2.1) is said to have an Equilibrium 
Point if strategies $(U^{1*}, \ldots, U^{N*})$ exist such that the 
following hold for p = 1, ... N: 

$J^p(U^{1*}, \ldots, U^{p-1*}, U^p, U^{p+1*}, \ldots, U^{N*}) \ge J^p(U^{1*}, \ldots, U^{N*})$ (2.3) 

or equivalently 

$\min_{U^p \in \mathcal{U}^p} J^p(\underline{U}^*; U^p) = J^p(\underline{U}^*)$ (2.4) 

where 

$(\underline{U}^*; U^p) = (U^{1*}, \ldots, U^{p-1*}, U^p, U^{p+1*}, \ldots, U^{N*})$ (2.5)⁵ 

and $(\underline{U}^*) = (U^{1*}, \ldots, U^{N*})$ (2.6)⁵ 

⁴ We assume complete information in what follows unless 
otherwise specified. 

Thus no player can unilaterally deviate from his equilibrium 
strategy and improve his payoff function. Nash (1950) proved 
the existence of such points in mixed strategies for finite 
games. The equilibrium points of a Two-Person Zero-Sum 
Game are called its Saddle Points. 
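
For a finite game in normal form, the equilibrium condition can be checked by direct enumeration of pure-strategy profiles. The following is a minimal sketch under the assumption that each player's cost is stored in a dictionary keyed by strategy profiles; the function name and the two-player example are hypothetical. It searches pure strategies only, whereas the existence result quoted above concerns mixed strategies, so the search may return an empty list.

```python
from itertools import product


def equilibrium_points(costs, sizes):
    """Pure-strategy equilibrium points of a finite N-person cost game.

    costs : list of dicts, costs[p][profile] is the cost to player p
    sizes : number of pure strategies of each player
    A profile is kept if no player can lower his own cost by a
    unilateral change of strategy.
    """
    points = []
    for profile in product(*(range(n) for n in sizes)):
        if all(
            costs[p][profile] <= costs[p][profile[:p] + (alt,) + profile[p + 1:]]
            for p, n in enumerate(sizes)
            for alt in range(n)
        ):
            points.append(profile)
    return points


# Hypothetical two-player game; each player has strategies 0 and 1.
costs = [
    {(0, 0): 1, (0, 1): 3, (1, 0): 0, (1, 1): 2},  # costs to player 1
    {(0, 0): 1, (0, 1): 0, (1, 0): 3, (1, 1): 2},  # costs to player 2
]
print(equilibrium_points(costs, (2, 2)))  # prints [(1, 1)]
```

In this example the only equilibrium point is (1, 1), although both players would incur a lower cost at (0, 0), which illustrates that an equilibrium point need not be noninferior.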


In a two-person zero-sum game a player can assure 
himself a Security Level for his payoff by playing a certain 
pure strategy - known as his Minimax Strategy - and his 
antagonist cannot prevent him from doing this even if he 
has full knowledge of the above strategy. Once again this 
latter assumption, which is made in a worst-case sense, is 
outside the structure of the game and, if mixed strategies 
are allowed, saddle point strategies are always to be 
preferred by the players to their minimax strategies. In 
an N-person game, minimax strategies have the same 
significance for any player, with the rest of the players 
acting as a combined antagonist. 

⁵ This notation is used freely hereafter. 
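
The security level of a minimizing player in pure strategies can be sketched as follows. The matrix layout (rows for the player's own strategies, columns for the strategies of the antagonist or of the combined antagonist), the function name and the numbers are assumptions made for illustration only.

```python
def security_level(cost_matrix):
    """Pure minimax strategy and security level of the minimizing row player.

    cost_matrix[i][j] is the player's cost when he plays row i and the
    (combined) antagonist plays column j. Ties go to the lowest row index.
    """
    worst = [max(row) for row in cost_matrix]  # worst case of each row
    best = min(range(len(worst)), key=worst.__getitem__)
    return best, worst[best]


# Hypothetical 3 x 2 cost matrix.
print(security_level([[3, 1], [2, 4], [2, 2]]))  # prints (2, 2)
```

Here the third row guarantees a cost of at most 2 whatever the antagonist does, even if the antagonist knows the row in advance.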

Since it is possible to have more than one 
equilibrium point in a general game, extra concepts are 
needed to define the noncooperative solution. Two 
equilibrium points $\underline{U}^*$ and $\underline{\hat{U}}$ are Equivalent if, for all 
p = 1, ... N, we have 

$J^p(\underline{U}^*) = J^p(\underline{\hat{U}})$ (2.7) 

Two equilibrium points $\underline{U}^*$ and $\underline{\hat{U}}$ are Interchangeable if 
any Recombination $\underline{\tilde{U}}$ - every $\tilde{U}^p$ is either $U^{p*}$ or $\hat{U}^p$ - 
is also an equilibrium point. All the saddle points are 
automatically both equivalent and interchangeable in the 
case of two-person zero-sum games and hence constitute the 
solution. This not being true in a general N-person game, 
the solution is defined differently depending upon whether 
or not the players are allowed to communicate among 
themselves to decide on certain equilibrium strategies. The 
Vocal Solution (V-Solution), in which communication is 
allowed, is given either as a set E of equilibrium 
points Safely Equivalent⁶ with respect to some admissible 
set A containing E or as a set M of minimax points. 
The Tacit Solution (T-Solution), in which no communication 
is allowed, is given as a set M* of minimax points or as 
a set E* of Interchangeable equilibrium points. 

The interchangeability is essential only for the T-solution. Otherwise there will be a coordination problem, since there is no communication between the players in this case. The safe equivalence is a weakened equivalence concept which is as follows. The Safe Payoff for player $p$ in the set $A$ of equilibrium points, from strategy $U^p$, is given by

$$ J^{p*}(U^p; A) = \min_{\underline{U} \in A(U^p)} J^p(\underline{U}) \tag{2.8} $$

where $A(U^p)$ is the set of those equilibrium points in $A$ where player $p$ uses $U^p$ as his strategy. Two equilibrium points $\underline{U}^*$ and $\hat{\underline{U}}$ are Safely Equivalent with respect to $A$ if their respective safe payoffs are equal for all the players. That is, for $p = 1, \ldots, N$, we have

$$ J^{p*}(U^{p*}; A) = J^{p*}(\hat{U}^p; A) \tag{2.9} $$

The set $A$ in the V- and T-solutions itself is obtained by certain reduction procedures applied on the set of all equilibrium points in the game, based on some Payoff and Risk Dominance relations. Payoff Dominance shows the preferences of the players over the several equilibrium strategies, and Risk Dominance compares the risks each of the players takes in sticking to a certain equilibrium strategy favourable to him against the possibility of the other players not adhering to it. These concepts are too involved to be presented in detail here. The V- and T-solutions are given as minimax points if one fails to determine the solution in terms of equilibrium points by the above procedure.

6 To be defined below.

Cooperative Solutions:

The cooperative solution of a game is given by its Pareto Optimal Strategies. Under uninhibited communication$^7$ between them, the players under cooperation agree to play a binding Pareto optimal strategy and enforce it mutually. A joint strategy $\underline{U}^o$ is Pareto Optimal if for any other strategy $\underline{U}$, we have

$$ \{ J^p(\underline{U}) \le J^p(\underline{U}^o),\ p = 1, \ldots, N \} \ \text{only if} \ \{ J^p(\underline{U}) = J^p(\underline{U}^o),\ p = 1, \ldots, N \} \tag{2.10} $$

The strategy being jointly coordinated, it implies that the information sets corresponding to this strategy are obtained by the total information available to all the players put together. However, while implementing, each player will take recourse to his own information sets.

7 In the Vocal Solution under noncooperation, the players do not strike compromise outside the equilibrium points or minimax strategies.
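
For a finite game the Pareto optimality test amounts to discarding every joint strategy that is dominated by another one. The following is a minimal sketch under that reading; the function name `pareto_optimal_profiles` and the cost table `COSTS` are hypothetical and serve only as an illustration (payoffs are costs to be minimized).

```python
from itertools import product

def pareto_optimal_profiles(costs, sizes):
    """Return the joint strategies that no other joint strategy dominates.

    costs(profile) -> tuple with one cost per player (hypothetical interface).
    """
    profiles = list(product(*(range(m) for m in sizes)))

    def dominates(a, b):
        # a dominates b: nobody pays more under a and somebody pays strictly less.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    return [u for u in profiles
            if not any(dominates(costs(v), costs(u)) for v in profiles)]

# Hypothetical two-player cost table (each player minimizes his own entry).
COSTS = {(0, 0): (1, 1), (0, 1): (3, 0), (1, 0): (0, 3), (1, 1): (2, 2)}

def example_costs(profile):
    return COSTS[profile]

print(pareto_optimal_profiles(example_costs, (2, 2)))
```

Three of the four joint strategies survive here, which illustrates why the cooperative solution still requires a selection among the Pareto optimal strategies.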



Expressed mathematically, the vector $\underline{J} = (J^1, \ldots, J^N)$ can only introduce a Partial Ordering on the space of joint strategies, and minimality with respect to this ordering is Pareto optimality. Thus, while Nash equilibrium is a Stability Concept (since the strategy is stable for any player against his unilateral deviation), Pareto optimality is an Optimality Concept in N-person game theory.

Since in general there are several Pareto Optimal strategies in a game, the cooperative solution involves a selection of one such strategy. If the payoffs to the various players can be compared and the total payoff redistributed among them, then that Pareto optimal strategy which minimises their Total Payoff expressed in a common unit is chosen. The Characteristic Function Theory (von Neumann and Morgenstern 1953 and Luce and Raiffa 1957) deals with the Redistribution Problem by considering the Security Levels of the various Coalitions in the game.

In games in which a comparison of payoffs cannot be made, the Dilemma is resolved by considering Bargaining between the players with the initial point as the noncooperative solution. This solution is justified by Nash (1953) by an axiomatic approach as well as by considering the Threats and Demands for the players as moves in an overall game. The noncooperative solution reflects the Optimal Threats while the bargaining problem reflects the Optimal Demands.





2.3 FORMULATION OF N-PERSON DIFFERENTIAL GAMES

Unlike finite games, Differential Games cannot be 
represented in the form of a game tree since each player 
possesses a continuum of moves and a continuum of alterna- 
tives at each move. Thus the advantage of representing 
imperfect information pictorially by information sets is 
lost. The node becomes the state of the game in this case 
with the first node specified by an Initial Condition and 
the outcomes described by a suitable Terminal Surface. The 
alternatives for any player at each move are specified by 
a Control Restraint Set and the transition occurring because
of a choice of the alternatives is described by a Differential
Equation. Any imperfectness of information to a
player is represented by an Observation Equation. Thus the 
following is the Extensive Formulation of a Deterministic 
N-Person Differential Game. 

The state of the game, $x$ of dimension $n$, satisfies a vector differential equation

$$ \dot{x} = f(x, \underline{u}, t) \tag{2.11} $$

where

$$ \underline{u} = (u^1, \ldots, u^p, \ldots, u^N) \tag{2.12} $$

and $u^p$ is the $r^p$-dimensional control action vector of the $p$th player and $t$ denotes time.

The state of the game is to be transferred by the control actions of the players from an initial state

$$ x(t_0) = x_0 \tag{2.13} $$

to a final state contained in a terminal surface of dimension $n$ given by

$$ (x_f, t_f) = 0 \tag{2.14} $$

or parametrically as

$$ x_f = X(\sigma), \quad t_f = T(\sigma) \tag{2.15} $$

where $\sigma$ ranges over an $n$-dimensional cube.


The $p$th player chooses his control action $u^p(t)$ over the time interval $[t_0, t_f]$ so as to minimize his payoff functional

$$ J^p[x_0, t_0, \underline{u}] = \phi^p(x_f, t_f) + \int_{t_0}^{t_f} L^p(x, \underline{u}, t)\, dt \tag{2.16} $$

The control action of each player is based on his information of the state of the game specified through his observation equation.

The $p$th player makes the $m^p$-dimensional (with $m^p \le n$) observations $y^p$ given by

$$ y^p = h^p(x, t) \tag{2.17} $$

In this context, perfect information to a player implies that his observations are identical with the state at any time. The $p$th player chooses his strategy as a function of his information into his control restraint set $\Omega^p$, i.e.,

$$ u^p = U^p(y^p, t) \tag{2.18} $$

Other assumptions about the smoothness properties of the various functions, the region of the state space where the game takes place and so forth will be introduced later on.
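
The formulation above can be exercised numerically once strategies are fixed. The sketch below plays out a game by forward-Euler integration and evaluates payoffs of the form (2.16). It is an illustration only: the function name `simulate`, the scalar dynamics, the strategies and the cost functions are hypothetical, perfect information is assumed (each player observes the state itself) and the terminal surface is taken to be a fixed final time.

```python
import math

def simulate(f, strategies, running_costs, terminal_costs, x0, t0, tf, steps=1000):
    """Forward-Euler play of x' = f(x, u, t) under strategies u^p = U^p(x, t).

    Returns the final state and one payoff per player
    (terminal cost plus integrated running cost).
    """
    dt = (tf - t0) / steps
    x, t = list(x0), t0
    integrals = [0.0] * len(strategies)
    for _ in range(steps):
        u = [U(x, t) for U in strategies]        # joint control action
        for p, L in enumerate(running_costs):
            integrals[p] += L(x, u, t) * dt      # accumulate running cost
        dx = f(x, u, t)
        x = [xi + dxi * dt for xi, dxi in zip(x, dx)]
        t += dt
    payoffs = [phi(x, tf) + I for phi, I in zip(terminal_costs, integrals)]
    return x, payoffs

# Hypothetical scalar two-player game: x' = u1 + u2 on [0, 1].
f = lambda x, u, t: [u[0] + u[1]]
strategies = [lambda x, t: -x[0], lambda x, t: 0.0]      # closed-loop laws
running = [lambda x, u, t: u[0] ** 2, lambda x, u, t: x[0] ** 2]
terminal = [lambda x, t: x[0] ** 2, lambda x, t: 0.0]

x_final, payoffs = simulate(f, strategies, running, terminal, [1.0], 0.0, 1.0)
print(x_final, payoffs, math.exp(-1.0))
```

With these choices the state decays as exp(-t), so the printed final state can be compared with exp(-1); the Euler step count controls the accuracy.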


Many new classes of games can be constructed by considering that functions such as $f$, $h^p$ and $L^p$ are noisy. That is, we have

$$ \dot{x} = f(x, \underline{u}, w_1, t) \tag{2.19} $$

$$ y^p = h^p(x, w_2^p, t) \tag{2.20} $$

where $w_1$ and $w_2^p$ are random disturbance vectors. In this case, the players minimize their payoff functions in a statistical expectation sense. These are called Stochastic Differential Games with imperfect information because of the presence of noise terms in (2.19) and (2.20). These and other information patterns are considered in recent literature (Ragade 1968, Ciletti 1969 and Rhodes 1969).

A player is said to have complete information if he has full knowledge about the various functions involved in the formulation as well as the statistical information about all the random disturbances. The solution concepts in these games will be discussed next.

2.4 SOLUTION CONCEPTS OF N-PERSON DIFFERENTIAL GAMES 

The solution concepts of N-person differential games 
should essentially be the same as those discussed in 
Section 2.2, since these concepts are defined for the Normal 
Form of the game which does not explicitly include the 
specific constraints in the game. Thus the solution of the 
game itself depends upon the information patterns to the 
players and other constraints on communication and coopera- 
tion between them. The actual normal form of a differential 
game will be obtained in Chapter III, but we discuss the 
implications of the concepts here. 

The game formulated in Section 2.3 in (2.11)-(2.18) is said to have an Equilibrium Point if strategies $(U^{1*}, \ldots, U^{N*})$ exist such that the following holds for $p = 1, \ldots, N$:

$$ J^p[x_0, t_0, \underline{U}^*] \le J^p[x_0, t_0, (\underline{U}^*; U^p)] \tag{2.21} $$

The strategies $\underline{U}^*$ and $(\underline{U}^*; U^p)$ are further restricted to be Playable, which means that they assure Termination of the game, which is necessary for the evaluation of the various payoff functionals.

The strategy of any player is to be remembered as a function of the information to him. Thus under perfect information a player employs a Closed-loop control law as his strategy, and under no observations (except the initial conditions) he has to implement his strategy as an Open-loop control law. Considering a finite multistage game, Starr and Ho (1969 b) illustrated specifically that the Nash equilibrium solutions are different with these two different information patterns, which are the ones considered in detail in the following.

The question of whether the Principle of Optimality applies in some form to the Nash equilibrium solution has been raised by Starr and Ho (1969 b). It is obvious that the Principle of Optimality, that any part of the optimal solution or trajectory is optimal between its end points, applies equally well here. For players having perfect information and hence using closed-loop control laws, the Imbedding Principle is valid and, along with the Principle of Optimality, yields directly from the definition of the equilibrium point,

$$ J^p[x, t, \underline{U}^*] \le J^p[x, t, (\underline{U}^*; U^p)] \tag{2.22} $$

for any $x$, $t$ by the Dynamic Programming argument. On the other hand, for players having no observations, the question of multiple moves or stages is only artificial and illusory, at least to the players concerned, and the dynamic programming argument does not arise. The implications of these remarks on the necessary conditions satisfied by these strategies will be examined now.

The Minimum Principle, as originally stated by Pontryagin, can be viewed as the outgrowth of the Hamiltonian approach to variational problems and is applicable to open-loop control laws. The generalization of this to the Nash equilibrium situation as stated by Karvovskiy and Kuznetsov (1966) and Case (1967) is thus applicable to games with no observations to the players except the initial conditions. In contrast to this, the Dynamic Programming method is a Value-function approach similar to the Hamilton-Jacobi theory and is applicable to closed-loop control laws. The generalization of this as stated by Sarma et al. (1969) and Starr and Ho (1969 a) applies to games with perfect information to the players. The derivation of these results will be pursued in Chapter III.

The inequality in (2.21) and (2.22) requires a simultaneous selection of strategies by the players. A variation of the equilibrium concept with a Hierarchical Type of Information, similar to the Theory of Minimax (Danskin 1967), appears in differential games studied by Soviet authors (for example Pontryagin 1966 and Gindes 1967). According to this, the $N$th player chooses his strategy first, then, knowing this, the $(N-1)$st player, and so on down the line, and finally the first player with the knowledge of all the other strategies, all under noncooperation. The inequality (2.21) gets modified as follows:

$$ \begin{aligned}
J^1[x_0, t_0, u^1, u^2, \ldots, u^N] &\ge J^1[x_0, t_0, u^{1*}, u^2, \ldots, u^N] \\
J^2[x_0, t_0, u^{1*}, u^2, \ldots, u^N] &\ge J^2[x_0, t_0, u^{1*}, u^{2*}, \ldots, u^N] \\
&\ \ \vdots \\
J^N[x_0, t_0, u^{1*}, u^{2*}, \ldots, u^N] &\ge J^N[x_0, t_0, u^{1*}, u^{2*}, \ldots, u^{N*}]
\end{aligned} \tag{2.23} $$


The inequality (2.22) also gets modified similarly under the
same assumptions about the players' observations of the
state of the game.
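
The hierarchical ordering of choices can be illustrated on a finite game by backward induction: the last player in index order commits first and each earlier player reacts with full knowledge of the choices already made. This is a sketch under that interpretation, not the construction used in the thesis; the function name `hierarchical_solution` and the cost table `COSTS` are hypothetical, and ties are broken in favour of the lowest strategy index.

```python
def hierarchical_solution(costs, sizes):
    """Backward-induction solution when player N moves first, ..., player 1 last.

    costs(profile) -> tuple with one cost per player (hypothetical interface);
    profile[p] is the strategy index of player p (0-based).
    """
    n = len(sizes)

    def complete(fixed):
        # fixed holds the choices of players n-1, n-2, ... made so far.
        p = n - 1 - len(fixed)               # next player to choose
        if p < 0:
            return tuple(reversed(fixed))    # back to index order 0..n-1
        best = None
        for choice in range(sizes[p]):
            outcome = complete(fixed + (choice,))
            if best is None or costs(outcome)[p] < costs(best)[p]:
                best = outcome
        return best

    return complete(())

# Hypothetical two-player cost table; player index 1 commits first.
COSTS = {(0, 0): (1, 2), (1, 0): (2, 0), (0, 1): (3, 5), (1, 1): (0, 1)}

print(hierarchical_solution(lambda profile: COSTS[profile], (2, 2)))
```

In this table the simultaneous-choice equilibrium point is (0, 0) with costs (1, 2), whereas the hierarchical outcome is (1, 1) with costs (0, 1), so the two concepts need not coincide.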

A control action $\underline{u}^o$ is said to be Pareto Optimal if for any other control action $\underline{u}$, the following is true:

$$ \{ J^p[x_0, t_0, \underline{u}] \le J^p[x_0, t_0, \underline{u}^o],\ p = 1, \ldots, N \} \ \text{only if} \ \{ J^p[x_0, t_0, \underline{u}] = J^p[x_0, t_0, \underline{u}^o],\ p = 1, \ldots, N \} \tag{2.24} $$

Dynamic Programming can be used successfully for obtaining $\underline{u}^o$ if such control actions are finite in number (see Zadeh 1963). In continuous-time deterministic problems, this is not true and necessary conditions similar to Pontryagin's minimum principle are used for the purpose (see Chapter V). Since there is perfect agreement in the beginning between the players, it can be implemented in closed-loop or open-loop depending upon the players' observations.

Similar to finite games, we assume that the players strike cooperation at the Commencement of the Play and decide on a Pareto Optimal strategy. In implementing this strategy individually, each player should have confidence in the others' adherence to the agreed strategy. If the player has perfect information, his running knowledge of state enables him to detect the departure of other players, the matter being particularly simple in the case of two-player games.

We will not consider the delaying tactics, if any, of the players in postponing cooperation to a later stage (see for example Lawser and Volz 1969). We believe that such tactics can be discussed in a model which allows the Cooperation and Communication between the players as moves in the model.


2.5 CONCLUSIONS 

A general class of N-person Differential Games is formulated in this chapter. The solution concepts of finite games seem applicable to Differential Games as well. The concepts of closed-loop and open-loop control laws in control theory are applicable mainly to Two Different Information Patterns, perfect and null information respectively to the players. The two main approaches for deriving the necessary conditions, Dynamic Programming and Pontryagin's minimum principle, are thus applicable to these cases respectively. This is the subject matter of the next chapter.

We restrict our attention to Pure Strategies only 
in this thesis because of the difficulties in implementing 
mixed strategies. Behaviour strategies are easier to 
implement, being associated with the information sets and 
we feel they have a lot of importance in Stochastic 
Differential Games. 



CHAPTER III 


NECESSARY CONDITIONS FOR NONCOOPERATIVE SOLUTION

3.1 INTRODUCTION 

In the last chapter we saw that the concept of 
Nash Equilibrium is central to the noncooperative solution 
of a game. In general, for any problem which is fairly 
complex, one has to resort to the application of suitable 
Necessary Conditions for the determination of the 
equilibrium strategies. Alternate approaches are possible 
for simpler problems. For example, Petrosyan (1965) reduces
a multi-pursuer multi-evader game under noncooperation into
an Integer Programming Problem by considering the component 
games with multiple pursuers and single evader and single 
pursuer and multiple evaders. 

In this chapter, we shall obtain a Modified Minimum Principle for the equilibrium strategies by the application of Dynamic Programming, a rigorous version of which appears in (Sarma et al. 1969). For the class of games studied in this chapter, we assume Complete and Perfect Information to the players and the existence of a unique equilibrium point in pure strategies over the region of interest except for starting points whose measure is zero. Surfaces containing Abnormal and Singular Solutions and other switching surfaces will be studied in the next chapter.

As it has become customary to study linear problems with quadratic cost functionals (see for example Bellman 1967, Ho et al. 1965, Starr and Ho 1969 a and Rhodes 1969), we shall consider one such problem. Such problems provide good insight and are amenable to analytical manipulations. The second example we consider is the noncooperative solution of a Double Integral Plant with time and fuel minimization.

3.2 NECESSARY CONDITIONS FOR NONCOOPERATIVE SOLUTION

Before proceeding to the derivation of the necessary
conditions, we construct the Normal Form of the deterministic
N-person differential game with perfect information to all 
the players formulated in Section 2.3. The relevant 
equations are reproduced here since frequent reference is 
made to them in this chapter. 


The game satisfies the state equation$^1$

$$ \dot{x} = f(x, \underline{u}, t) \tag{3.1} $$

where $x$ and $u^p$ are of dimensions $n$ and $r^p$ respectively. The region of interest in the state-time space in which the game takes place is known as the Playing Space $\mathcal{R}$ and the terminal surface $\mathcal{T}$ is part of its boundary made up of the union of smooth surfaces

$$ \mathcal{T} = \bigcup_{i=1} \mathcal{T}_i \tag{3.2} $$

Each $\mathcal{T}_i$ is given by the equations$^2$

$$ t_f = T_{ij_i}(\sigma), \quad x_f = X_{ij_i}(\sigma) \tag{3.3} $$

1 We use the underbar notation when similar quantities or variables related to all the players are put together. Thus $\underline{u} = (u^1, \ldots, u^p, \ldots, u^N)$.

The players choose strategies as functions of state and time satisfying the control variable constraints, i.e.,

$$ u^p = U^p(x, t) \tag{3.4} $$

such that

$$ u^p \in \Omega^p(x, t) \tag{3.5} $$

where $\Omega^p$ is a restraint set given by an inequality of dimension $l^p$ such as

$$ K^p(x, u^p, t) \ge 0 \tag{3.6} $$

The $p$th player minimizes his payoff functional$^3$

$$ J^p[\xi, \tau, \underline{U}] = \phi^p(x_f, t_f) + \int_{\tau}^{t_f} L^p(x, \underline{U}(x,t), t)\, dt \tag{3.7} $$

We assume that the functions $L^p$, $f$ and $K^p$ and their partial derivatives with respect to $x$ are continuous in their arguments, i.e., they are of class $C^{(1)}$. Similarly, each $T_{ij_i}$ and $X_{ij_i}$ and the function $\phi^p$ are assumed smooth on each $\mathcal{T}_i$.

2 The subscript $ij_i$ is associated with a Regular Decomposition to be introduced below.

3 Here $\phi^p$ can be expressed as a function of $\sigma$ in view of (3.3).

Normal Form of the Game:

Let $\mathcal{Z}^p$ be the class of all functions satisfying (3.4)-(3.6) which are piecewise continuous with piecewise continuous derivatives, i.e., they belong to the class piecewise $C^{(1)}$. For any $U^p \in \mathcal{Z}^p$, the solutions to (3.1), called paths, may Bifurcate or Coalesce at the points of discontinuity of one or more $U^p$. Otherwise the solution will be unique.

We shall say that $U^p \in \mathcal{Z}^p$, $p = 1, \ldots, N$, or $\underline{U} \in \underline{\mathcal{Z}}$, is a Playable N-tuple if for each $(\xi, \tau) \in \mathcal{R}$, every solution stays in $\mathcal{R}$ and reaches the terminal surface in finite time. Thus playability is Joint Controllability of the state of the game, and the payoff $J^p$ can be multivalued because of the bifurcation in paths cited earlier. We consider maximum nonvoid subclasses $\mathcal{U}^p \subset \mathcal{Z}^p$ such that $U^p \in \mathcal{U}^p$, $p = 1, \ldots, N$, is a playable N-tuple of strategies. Thus a Normal Form of the game is given by

$$ \{ \mathcal{U}^1, \ldots, \mathcal{U}^N;\ J^1, \ldots, J^N \} \tag{3.8} $$

with $\mathcal{U}^1, \ldots, \mathcal{U}^N$ as the classes of pure strategies of the respective players.


Assumptions on Optimal Paths:

Let $\underline{U}^*$ be the noncooperative solution in terms of equilibrium points for the game (3.8). For the class of games considered here, if there is more than one path starting from a point $(\xi, \tau) \in \mathcal{R}$ because of discontinuity in the strategies of some players, then the payoffs to These Players are independent of the various paths and, by assumption, we have

$$ J^p[\xi, \tau, (\underline{U}^*; U^p)] \ge J^p[\xi, \tau, \underline{U}^*] = W^p(\xi, \tau) \tag{3.9} $$

where $\underline{W}$ is called the Value Function of the noncooperative game.


We make the following assumptions on $\underline{U}^*$ and the associated solutions $x^*$ to (3.1), called the optimal paths.

(i) $\underline{U}^*$ exists.

(ii) The decomposition associated with $\underline{U}^*$ is Regular (see Figure 3.1). This consists of disjoint open subregions $\mathcal{R}_{ij}$, $j = 1, \ldots, j_i$, and the switching surfaces $\mathcal{M}_{ij}$ and $\mathcal{M}_{i_1 \ldots i_k}$.

The manifold $\mathcal{M}_{ij}$ separates the subregions $\mathcal{R}_{ij}$ and $\mathcal{R}_{i,j+1}$. Although the manifold arises out of the use of discontinuous strategies by some of the players, the Value Function $\underline{W}$ is continuous across $\mathcal{M}_{ij}$. The manifold $\mathcal{M}_{ij}$ can be expressed as

$$ t = T_{ij}(\sigma); \quad x = X_{ij}(\sigma) \tag{3.10} $$

where $\sigma$ ranges over an $n$-dimensional cube. The union of $\mathcal{M}_{ij_i}$ for all $i$ forms the Terminal Surface $\mathcal{T}$ given by (3.2) and (3.3) (see Footnote 2).

The manifold $\mathcal{M}_{i_1 \ldots i_k}$ is the intersection of the subregions $\mathcal{R}_{i_1}, \ldots, \mathcal{R}_{i_k}$. For starting points on this manifold there may be multiple optimal paths arising out of the discontinuities in the strategies of some players. The Value Functions of Only These Players$^4$ are continuous across the manifold.

(iii) If the initial point is interior to one of the subregions $\mathcal{R}_{ij}$, then the optimal trajectory $x^*$ is unique.

(iv) The optimal paths are never tangential to any of the switching manifolds or the terminal surface.


Further properties of the optimal paths $x^*$ are given in (Berkovitz 1967). The above assumptions together with the $n$-dimensionality of the terminal surface make the paths Normal (Berkovitz 1961), and the Abnormal and Singular surfaces and surfaces containing Perpetuated Dilemma to the players (Isaacs 1966) are ruled out in the present context.


Hamilton-Jacobi-Bellman Equations:

We shall now derive the Hamilton-Jacobi-Bellman equations satisfied by the Value Function $\underline{W}$ by the application of Dynamic Programming. For this, the game is considered as viewed by each of the players when all he knows is that the rest have perhaps chosen their noncooperative optimal strategies.

4 The Value Functions of the players using continuous strategies need not be continuous. This arises out of the nonequivalence of the equilibrium points and will be pursued in Chapter IV.


Let $(\xi, \tau)$ be a point of $\mathcal{R}_{ij}$. Then for the $p$th player,

$$ W^p(\xi, \tau) = \phi^p(x_f, t_f) + \Big( \int_{\tau}^{t_{ij}} + \sum_{k=j}^{j_i - 1} \int_{t_{ik}}^{t_{i,k+1}} \Big) L^{p*}\, dt \tag{3.11} $$

where

$$ L^{p*} = L^p(x^*, \underline{U}^*(x^*, t), t) \tag{3.12} $$

Thus from the properties of the optimal paths (Berkovitz 1967), it follows that $W^p_t$ and $W^p_x$ $^5$ exist and are continuous on $\mathcal{R}_{ij}$ with unique one-sided limits onto $\mathcal{M}_{i,j-1}$ and $\mathcal{M}_{ij}$.

To establish the partial differential equation satisfied by $W^p(\xi, \tau)$, we consider the particular nonoptimal strategy for the $p$th player,

$$ \tilde{U}^p(x,t) = \begin{cases} U^p(x,t), & (x,t) \in N(\xi,\tau) \\ U^{p*}(x,t), & (x,t) \notin N(\xi,\tau) \end{cases} \tag{3.13} $$

where $U^p \in \mathcal{U}^p$ and $N(\xi,\tau)$ is a neighbourhood of $(\xi,\tau)$ wholly contained in $\mathcal{R}_{ij}$. It can be shown that $\tilde{U}^p \in \mathcal{U}^p$. Let $\tau + \delta$ stand for the last time the trajectory leaves $N(\xi,\tau)$.

Now since $\underline{U}^*$ is equilibrium optimal,

$$ W^p(\xi,\tau) = J^p(\xi,\tau,\underline{U}^*) \le J^p(\xi,\tau,(\underline{U}^*; \tilde{U}^p)) \tag{3.14} $$

5 Variable subscripts indicate partial derivatives in the following.

The right-hand side of the inequality (3.14) can be expanded as

$$ \phi^p(x_f,t_f) + \Big( \int_{\tau}^{\tau+\delta} + \int_{\tau+\delta}^{t_f} \Big) L^p(x, (\underline{U}^*; \tilde{U}^p), t)\, dt = \int_{\tau}^{\tau+\delta} L^p(x, (\underline{U}^*; U^p), t)\, dt + W^p(x(\tau+\delta), \tau+\delta) \tag{3.15} $$

Hence in view of (3.14), we have

$$ -[W^p(x(\tau+\delta), \tau+\delta) - W^p(\xi,\tau)] \le \int_{\tau}^{\tau+\delta} L^p(x, (\underline{U}^*; U^p), t)\, dt \tag{3.16} $$

with the equality holding for $U^p = U^{p*}$.

We shall now let $N(\xi,\tau) \to (\xi,\tau)$, which implies that $\delta \to 0$. Since $W^p_t$ and $W^p_x$ are continuous on $\mathcal{R}_{ij}$, we can apply the mean value theorem to the inequality (3.16) and write the left-hand side as

$$ -W^p_t(\xi,\tau)\,\delta - W^p_x(\xi,\tau)\,[x(\tau+\delta) - \xi] + o(\delta) = -W^p_t(\xi,\tau)\,\delta - W^p_x(\xi,\tau)\, f(\xi, (\underline{U}^*(\xi,\tau); U^p(\xi,\tau)), \tau)\,\delta + o(\delta) \tag{3.17} $$

Similarly the right-hand side can be written as $L^p(\xi, (\underline{U}^*(\xi,\tau); U^p(\xi,\tau)), \tau)\,\delta + o(\delta)$, and in the limit the inequality (3.16) reduces to

$$ -W^p_t(\xi,\tau) \le L^p(\xi, (\underline{U}^*(\xi,\tau); U^p(\xi,\tau)), \tau) + W^p_x(\xi,\tau)\, f(\xi, (\underline{U}^*(\xi,\tau); U^p(\xi,\tau)), \tau) \tag{3.18} $$
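Since $U^p$ is arbitrary, (3.18) holds for all $U^p \in \mathcal{U}^p$ with the equality holding for $U^p = U^{p*}$. It can also be written in the following convenient forms.

$$ \begin{aligned}
-W^p_t(\xi,\tau) &= L^p(\xi, \underline{U}^*(\xi,\tau), \tau) + W^p_x(\xi,\tau)\, f(\xi, \underline{U}^*(\xi,\tau), \tau) \\
&= H^p(\xi, W^p_x(\xi,\tau), \underline{U}^*(\xi,\tau), \tau) \\
&= \min_{U^p \in \mathcal{U}^p} H^p(\xi, W^p_x(\xi,\tau), (\underline{U}^*(\xi,\tau); U^p(\xi,\tau)), \tau) \\
&= \min_{U^p \in \mathcal{U}^p} \big[ L^p(\xi, (\underline{U}^*(\xi,\tau); U^p(\xi,\tau)), \tau) + W^p_x(\xi,\tau)\, f(\xi, (\underline{U}^*(\xi,\tau); U^p(\xi,\tau)), \tau) \big]
\end{aligned} \tag{3.19} $$

where the Hamiltonian Function $H^p$ is defined as

$$ H^p(x, \lambda^p, \underline{u}, t) = \lambda_0^p L^p(x, \underline{u}, t) + \lambda^p \cdot f(x, \underline{u}, t) \tag{3.20} $$

In this chapter, it is invariably assumed that $\lambda_0^p$ is equal to unity.$^6$

Equation (3.19) is termed the Hamilton-Jacobi-Bellman equation and holds for $p = 1, \ldots, N$.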
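
As a sanity check of the Hamilton-Jacobi-Bellman relation, the single-player special case (N = 1) can be verified numerically on a problem with a known value function. The example is hypothetical and chosen only because it is solvable by hand: dynamics x' = u, running cost u^2, terminal cost x_f^2 at a fixed final time tf, for which the candidate W(x, t) = x^2 / (1 + tf - t) is assumed. The sketch evaluates the residual of "-W_t = min over u of (u^2 + W_x u)" by finite differences.

```python
def value(x, t, tf=1.0):
    """Candidate value function W(x, t) = x^2 / (1 + tf - t) (assumed example)."""
    return x * x / (1.0 + tf - t)

def hjb_residual(x, t, tf=1.0, h=1e-5):
    """Return (-W_t - min_u H, minimizing u) for H(u) = u^2 + W_x * u."""
    w_t = (value(x, t + h, tf) - value(x, t - h, tf)) / (2.0 * h)
    w_x = (value(x + h, t, tf) - value(x - h, t, tf)) / (2.0 * h)
    u_star = -w_x / 2.0                     # stationary point of the convex H
    h_min = u_star ** 2 + w_x * u_star      # minimized Hamiltonian
    return -w_t - h_min, u_star

residual, u_star = hjb_residual(2.0, 0.3)
print(residual, u_star)
```

The residual vanishes up to finite-difference error, and the minimizing control is the closed-loop law u = -x / (1 + tf - t); the terminal condition W(x, tf) = x^2 matches the assumed terminal cost.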

The Minimum Principle for the Players:

The necessary conditions obtained above can be expressed in the Hamiltonian Form by introducing adjoint variables or Lagrange multipliers and relating them to $W^p_x$ and $W^p_t$.

Let $(\xi,\tau)$ be a point in $\mathcal{R}_{ij}$. We consider the following linear differential equation with the final condition $\lambda^p(t_f) = \lambda^p_f$:

$$ \dot{\lambda}^p = -\Big( H^{p*}_x + \sum_{q=1}^{N} H^{p*}_{u^q}\, U^{q*}_x \Big) \tag{3.21} $$

where $H^p$ is given by (3.20) and the star notation is used as in (3.12) to indicate that the arguments are in terms of variables related to the optimal paths. Also let the components of $\lambda^p_f$ be given by the following system of linear equations:

$$ \phi^p_\sigma + L^p\big|_f\, \frac{\partial T_{ij_i}}{\partial \sigma} + \lambda^p_f \Big[ f\big|_f\, \frac{\partial T_{ij_i}}{\partial \sigma} - \frac{\partial X_{ij_i}}{\partial \sigma} \Big] = 0 \tag{3.22} $$

where $|_f$ indicates that the arguments correspond to the terminal surface. Equation (3.22) defines $\lambda^p_f$ as a continuous function of $(\xi,\tau)$ on $\mathcal{R}_{ij}$, and the solution $\lambda^p$ to (3.21) is also a continuous function of $(\xi,\tau)$ and $t$ on $\mathcal{R}_{ij}$ by standard theorems in differential equations.

6 Abnormal paths, on which $\lambda_0^p = 0$, will be discussed in the next chapter.


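Now we define the corner conditions which determine the left-hand limits $\lambda^{p-}$ from the right-hand limits $\lambda^{p+}$ at the manifolds $\mathcal{M}_{ik}$:

$$ [L^p(t_{ik}^+) - L^p(t_{ik}^-)]\, \frac{\partial T_{ik}}{\partial \sigma} + \lambda^{p+} \Big[ f(t_{ik}^+)\, \frac{\partial T_{ik}}{\partial \sigma} - \frac{\partial X_{ik}}{\partial \sigma} \Big] - \lambda^{p-} \Big[ f(t_{ik}^-)\, \frac{\partial T_{ik}}{\partial \sigma} - \frac{\partial X_{ik}}{\partial \sigma} \Big] = 0 \tag{3.23} $$

where $\lambda_0^{p-} = \lambda_0^{p+} = 1$ and $t_{ik}^-$ and $t_{ik}^+$ indicate that the arguments of the functions are appropriate one-sided limits at $t_{ik}$.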


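Thus the solution $\lambda^p$ of (3.21) is defined and continuous for $(\xi,\tau)$ in $\mathcal{R}_{ij}$ with unique one-sided limits on $\mathcal{M}_{ij}$ and satisfies the transversality and corner conditions (3.22) and (3.23), which can be stated in a more compact form as follows:

$$ \phi^p_\sigma + H^p(t_f)\, \frac{\partial T_{ij_i}}{\partial \sigma} - \lambda^p(t_f)\, \frac{\partial X_{ij_i}}{\partial \sigma} = 0 \tag{3.24} $$

$$ [H^p(t_{ik}^+) - H^p(t_{ik}^-)]\, \frac{\partial T_{ik}}{\partial \sigma} - (\lambda^{p+} - \lambda^{p-})\, \frac{\partial X_{ik}}{\partial \sigma} = 0 \tag{3.25} $$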


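Now for the Value Function $W^p$, we can write for any $(\xi, \tau)$ in $\mathcal{R}_{ij}$

$$ W^p_\xi = \cdots + \Big( \int_{\tau}^{t_{ij}} + \sum_{k=j}^{j_i - 1} \int_{t_{ik}}^{t_{i,k+1}} \Big) \Big( L^{p*}_x + \sum_{q=1}^{N} L^{p*}_{u^q}\, U^{q*}_x \Big) x^*_\xi\, dt \tag{3.26} $$

where the dots stand for the terms arising from the terminal cost and from the limits of integration. The terms involving integrals can be rewritten using (3.21) as follows:

$$ \Big( \int_{\tau}^{t_{ij}} + \sum_{k=j}^{j_i - 1} \int_{t_{ik}}^{t_{i,k+1}} \Big) \Big( -\dot{\lambda}^p - \lambda^p f^*_x - \sum_{q=1}^{N} \lambda^p f^*_{u^q}\, U^{q*}_x \Big) x^*_\xi\, dt \tag{3.27} $$

Further, in view of the system equation (3.1), on the optimal path $x^*$, we have

$$ \dot{x}^*_\xi = \Big( f^*_x + \sum_{q=1}^{N} f^*_{u^q}\, U^{q*}_x \Big) x^*_\xi \tag{3.28} $$

and (3.26) can be simplified as

$$ W^p_\xi = \cdots - \Big( \int_{\tau}^{t_{ij}} + \sum_{k=j}^{j_i - 1} \int_{t_{ik}}^{t_{i,k+1}} \Big) d(\lambda^p x^*_\xi) \tag{3.29} $$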


Now, from (3.29), (3.26), (3.22) and (3.23), it follows that

$$ W^p_\xi(\xi, \tau) = \lambda^p(\xi, \tau, \tau) \tag{3.30} $$

and

$$ W^p_x(x, t) = \lambda^p(\xi, \tau, t) = \lambda^p(x, t, t) \tag{3.31} $$

Thus we can write (3.19) as follows:

$$ -W^p_t(x^*, t) = H^p(x^*, \lambda^p, \underline{u}^*, t) = \min_{u^p} H^p(x^*, \lambda^p, (\underline{u}^*; u^p), t) \tag{3.32} $$

where $u^p = U^p(x^*, t)$ for some $U^p \in \mathcal{U}^p$. Equation (3.32) holds for all the players, i.e., $p = 1, \ldots, N$, and implies that at any point $(x, t)$ in $\mathcal{R}$ the game with payoffs defined by $H^p(x, \lambda^p, \underline{u}, t)$ has a pure strategy equilibrium point $\underline{u}^*$. The value of the game is

$$ [H^1(x, \lambda^1, \underline{u}^*, t), \ldots, H^N(x, \lambda^N, \underline{u}^*, t)] = -\underline{W}_t(x, t) \tag{3.33} $$

Let the functions $K^p(x, u^p, t)$ in (3.6) which define the restraint sets $\Omega^p(x, t)$ satisfy the constraint conditions that if $l^p > r^p$, then at each point $(x, u^p, t)$ at most $r^p$ components of $K^p$ can vanish and the matrix $\hat{K}^p_{u^p}$ formed from $\hat{K}^p$, the vanishing components of $K^p$, has maximum rank, this being true for $p = 1, \ldots, N$. Then there exist functions $\mu^p$ such that the following hold (Berkovitz 1967).

$$ H^p_{u^p} + \mu^p K^p_{u^p} = 0 \tag{3.34} $$

$$ \mu^p \le 0 \tag{3.35} $$

$$ \mu^p K^p = 0 \tag{3.36} $$

Thus when $u^p$ is interior to its restraint set, i.e., $K^p > 0$, (3.36) yields that $\mu^p = 0$. Thus condition (3.34) reduces to an important form

$$ H^p_{u^p} = 0 \tag{3.37} $$

Equation (3.37) is also true when $u^p$ is unconstrained, i.e., when (3.6) is absent.
3.3 FURTHER RESULTS PERTAINING TO THE EQUILIBRIUM CONCEPT

We obtained in Section 3.2 the necessary conditions for equilibrium strategies based on the results of Berkovitz (1967). Alternately, one can pose an optimal control problem for each player against the equilibrium play of the rest of the players. The equivalence of the necessary conditions for all these problems and those for the equilibrium point of the normal form of the game as constructed in Section 3.2 is shown by Berkovitz (1964). The following results are stated keeping this equivalence in mind.

Legendre-Clebsch Condition:

At any point on an optimal path excluding corners, if $\hat{K}^p$ is the vector formed from $K^p$ by taking those components that vanish at that point, then for all $e^p$ satisfying $\hat{K}^p_{u^p} e^p = 0$, it follows that

$$ e^p \big( (H^p + \mu^p K^p)_{u^p u^p} \big) e^p \ge 0 \tag{3.38} $$

If $u^p$ is interior to $\Omega^p$ or $u^p$ is unconstrained, then $\mu^p$ becomes a null vector by (3.36) and (3.38) reduces to the easier and classical form, viz., for all $e^p$,

$$ e^p \big( H^p_{u^p u^p} \big) e^p \ge 0 \tag{3.39} $$

or that $H^p_{u^p u^p}$ is positive semidefinite.

If on an extremal trajectory excluding corners, (3.39) is satisfied with strict inequality, i.e., $H^p_{u^p u^p}$ is positive definite, then the Legendre-Clebsch condition is said to be satisfied in the strengthened form. If, on the other hand, $H^p_{u^p u^p}$ is only positive semidefinite, the presence of Singular Control Variables is indicated and these are discussed in the next chapter.
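
Whether the strengthened form holds at a point reduces to a positive-definiteness test on the symmetric matrix of second derivatives of the Hamiltonian with respect to the player's own control. A minimal sketch using a Cholesky factorization attempt is given below; the function name `is_positive_definite` and the sample matrices are hypothetical.

```python
import math

def is_positive_definite(m, tol=1e-12):
    """Return True if the symmetric matrix m admits a Cholesky factorization."""
    n = len(m)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:          # zero or negative pivot: not definite
                    return False
                lower[i][i] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True

# Hypothetical second-derivative matrices of a Hamiltonian in a 2-vector control.
print(is_positive_definite([[2.0, -1.0], [-1.0, 2.0]]))   # strengthened form
print(is_positive_definite([[1.0, 1.0], [1.0, 1.0]]))     # only semidefinite
```

The second matrix is positive semidefinite but singular, which is the situation associated above with singular control variables.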
Null-Observation Games:

We observed in Chapter II that a player $p$ with no observations implements his strategy $U^p$ as an open-loop law, i.e., his control action is given by

$$ u^p = U^p(x_0, t_0, t) \tag{3.40} $$

Though there is no use defining a value function for this player, Pontryagin's minimum principle holds for his one-sided optimal control problem. Thus the necessary conditions of Section 3.2 hold with the difference that $U^p_x$ will be zero. Therefore if none of the players have any observations, the adjoint equations (3.21) will have the form (Case 1967 and Starr and Ho 1969)

$$ \dot{\lambda}^p = -H^p_x \tag{3.41} $$

Powerful mathematical tools like Functional Analysis, which are mainly applicable to open-loop controls (Gindes 1967 and Kirillova 19??), are applicable to some of the problems of this category.

Sufficient Conditions:

The sufficiency conditions in the literature of optimal control and variational calculus are based either on the Value Function Approach or on the Conjugate Point Condition. The Value Function method is primarily applicable for the closed-loop control laws and the conjugate point method for the open-loop control laws. Thus we state two simple sufficient conditions below.

For the class of perfect information games studied in this chapter, i.e., having no Abnormal and Singular solutions, if the strategies satisfy the following Hamilton-Jacobi equations with their resulting Values, then the strategies are optimal.

$$ -W^p_t(x, t) = \min_{u^p \in \Omega^p} H^p(x, W^p_x(x, t), (\underline{u}^*; u^p), t) \tag{3.42} $$

It is to be noted that this condition is global and is a stronger requirement than that in (3.19) and (3.32).

If on an extremal trajectory of a null-observation game, the strengthened Legendre-Clebsch condition is satisfied, then a necessary and sufficient condition that any player's strategy is optimal is that there be no conjugate points for the Accessory Minimization Problem of this player, which will be a linear-quadratic optimal control problem. From the results in (Breakwell and Ho 1965 and Schmitendorf and Citron 1969), it follows that the Riccati equations of the players in the Accessory Game (which is linear-quadratic once again) should have bounded solutions.

3.4 EXAMPLES

We consider two examples in this section to
illustrate the application of the results in Sections 3.2
and 3.3. The first is a game with linear dynamics and
quadratic payoff functionals studied by Starr and Ho (1969)
and Rhodes (1969) in which the optimal strategies are
continuous. The second example involves time and fuel minimization
of a double integral plant, emphasizing the presence of
discontinuous optimal strategies. We dwell on this example
at considerable length in the later chapters.

Example 3.1:

The state of the game, x of dimension n,
satisfies the linear differential equation

    \dot{x}(t) = A(t) x(t) + \sum_{p=1}^{N} B^p(t) u^p(t)         (3.43)

where u^p, the unconstrained control action vector of the
player p, is of dimension r_p. The matrices A(t) and
B^p(t) are of dimensions n x n and n x r_p respectively.

The payoff functional of the p th player is given
by

    J^p[x_0, t_0, \underline{u}] = x_f^T F^p x_f + \int_{t_0}^{t_f} [ x^T(t) Q^p(t) x(t)
                                   + \sum_{j=1}^{N} u^{jT}(t) R^{pj}(t) u^j(t) ] dt      (3.44)

where x_0 is the initial state at time t_0 and the final
state x_f, at the specified t_f, is arbitrary. The
matrices Q^p(t) and R^{pj}(t) for p, j = 1, ..., N are
symmetric matrices of proper dimension. The matrix
R^{pp}(t) is assumed to be positive definite.

7 Transpose Notation is used only in this example because
of the familiarity of the results for these problems in
this form.

The game with perfect information to all the players
can be solved by the Value Function approach. Assuming
W^p(x,t) to be of the form

    W^p(x,t) = x^T(t) S^p(t) x(t)                             (3.45)

we have from the Hamilton-Jacobi equation in any of the forms
(3.19), (3.32) or (3.42),

    - x^T(t) \dot{S}^p(t) x(t) = \min_{u^p} [ 2 < S^p(t) x(t), A(t) x(t) + \sum_j B^j(t) u^j(t) >
                                 + x^T(t) Q^p(t) x(t) + \sum_j u^{jT}(t) R^{pj}(t) u^j(t) ]      (3.46)

or solving (3.46)

    u^p(t) = - (R^{pp}(t))^{-1} B^{pT}(t) S^p(t) x(t)           (3.47)

Substituting (3.47) into (3.46), we have after dropping the
argument t,

    \dot{S}^p = - S^p A - A^T S^p - Q^p
                - \sum_j ( S^j B^j (R^{jj})^{-1} R^{pj} (R^{jj})^{-1} B^{jT} S^j
                           - S^p B^j (R^{jj})^{-1} B^{jT} S^j
                           - S^j B^j (R^{jj})^{-1} B^{jT} S^p )          (3.48)

with the final condition

    S^p(t_f) = F^p                                            (3.49)

Equations (3.47)-(3.49) for p = 1, ..., N constitute the
noncooperative solution of the game.
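
As a rough numerical illustration of (3.47)-(3.49), the sketch below integrates the scalar (n = 1) form of the coupled Riccati equations (3.48) backwards from the final condition (3.49) with a fixed-step Runge-Kutta scheme. The function names, the scalar specialisation and the parameter values are assumptions made for illustration only; they do not come from the thesis.

```python
import math

def riccati_rhs(S, a, b, q, r):
    """Scalar form of (3.48): returns dS^p/dt for every player p.

    S[p]    : current value of S^p
    a       : scalar A
    b[j]    : scalar B^j
    q[p]    : scalar Q^p
    r[p][j] : scalar R^{pj} (r[j][j] must be positive)
    """
    n = len(S)
    rhs = []
    for p in range(n):
        total = -2.0 * a * S[p] - q[p]
        for j in range(n):
            g = b[j] ** 2 / r[j][j]                      # B^j (R^jj)^-1 B^jT
            total += 2.0 * S[p] * g * S[j]               # cross terms in S^p and S^j
            total -= S[j] ** 2 * g * r[p][j] / r[j][j]   # cost to p of player j's control
        rhs.append(total)
    return rhs

def solve_backward(F, a, b, q, r, t0, tf, steps=2000):
    """Integrate (3.48) from S^p(tf) = F^p back to t0 with classical RK4."""
    h = (t0 - tf) / steps        # negative step: we march backwards in time
    S = list(F)
    for _ in range(steps):
        k1 = riccati_rhs(S, a, b, q, r)
        k2 = riccati_rhs([s + 0.5 * h * k for s, k in zip(S, k1)], a, b, q, r)
        k3 = riccati_rhs([s + 0.5 * h * k for s, k in zip(S, k2)], a, b, q, r)
        k4 = riccati_rhs([s + h * k for s, k in zip(S, k3)], a, b, q, r)
        S = [s + h * (c1 + 2 * c2 + 2 * c3 + c4) / 6.0
             for s, c1, c2, c3, c4 in zip(S, k1, k2, k3, k4)]
    return S

def feedback_gain(S_p, b_p, r_pp):
    """Gain of (3.47) in scalar form: u^p = -gain * x."""
    return b_p * S_p / r_pp

# Hypothetical two-player data. Player 2 has no influence (b[1] = 0), so S^1
# should reduce to the single-player Riccati solution tanh(tf - t).
S0 = solve_backward(F=[0.0, 0.0], a=0.0, b=[1.0, 0.0],
                    q=[1.0, 1.0], r=[[1.0, 1.0], [1.0, 1.0]],
                    t0=0.0, tf=1.0)
```

With b[1] = 0 the first equation decouples into the familiar single-player Riccati equation, which gives a simple sanity check on the sign conventions used in the sketch.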



The game with no observations to all the players can
be solved by posing the equivalent tracking problems to the
various players as follows. We define x^0, x^p, \xi^p and \zeta^p
for p = 1, ..., N as

    \dot{x}^0 = A(t) x^0(t) ;  x^0(t_0) = x_0                    (3.50)

    \dot{x}^p = A(t) x^p(t) + B^p(t) u^p(t) ;  x^p(t_0) = 0      (3.51)

    \xi^p = x^0 + x^p                                         (3.52)

    \zeta^p = - \sum_{j \ne p} x^j                            (3.53)

Now it is obvious that

    x = x^0 + x^1 + ... + x^N = \xi^p - \zeta^p                (3.54)

For the p th player, thus we have to minimize

    J^p = ||\xi^p_f - \zeta^p_f||^2_{F^p}
          + \int_{t_0}^{t_f} [ ||\xi^p - \zeta^p||^2_{Q^p} + ||u^p||^2_{R^{pp}}
          + \sum_{j \ne p} ||u^j||^2_{R^{pj}} ] dt                 (3.55)

Since the last term in the integral of (3.55) is outside the
choice of the p th player, the solution to the above problem
can be obtained as the solution of the following tracking
problem.

Determine u^p(t) to minimize

    J[x_0, t_0, u^p] = ||\xi^p_f - \zeta^p_f||^2_{F^p}
                       + \int_{t_0}^{t_f} [ ||\xi^p(t) - \zeta^p(t)||^2_{Q^p(t)}
                       + ||u^p(t)||^2_{R^{pp}(t)} ] dt            (3.56)


8 For convenience Norm notation is used here to indicate
the Quadratic Forms.

Example 3.2

The state of the game satisfies the differential
equations

    \dot{x}_1 = x_2
    \dot{x}_2 = u^1 + c u^2                                    (3.62)

where the control variables of the two players, u^1 and
u^2 respectively, are constrained as follows:

    |u^1| \le 1 ,  |u^2| \le 1                                 (3.63)

The players 1 and 2 choose their control action so
as to minimize respectively their payoff functionals

    J^1[x_0, u^1, u^2] = \int_0^{t_f} dt
    J^2[x_0, u^1, u^2] = \int_0^{t_f} ( |u^1| + b|u^2| ) dt      (3.64)

while driving the state of the game from an
arbitrary initial point x_0 in the state space at time
t = 0 to the origin, i.e., x_1(t_f) = x_2(t_f) = 0, where
t_f is free.

The equations (3.62) will be referred to hereafter
as the Double Integral Plant because of their form. The
full significance of the example will be discussed in the
later chapters. Here we confine ourselves to the
noncooperative solution for the case c > b to illustrate
the theory developed in this chapter.


The application of the necessary conditions of
Section 3.2 yields the following

    H^1(x, \underline{u}, \lambda^1) = 1 + \lambda^1_1 x_2 + \lambda^1_2 (u^1 + c u^2)
    H^2(x, \underline{u}, \lambda^2) = |u^1| + b|u^2| + \lambda^2_1 x_2 + \lambda^2_2 (u^1 + c u^2)      (3.65)

    u^1 = - sgn \lambda^1_2
    u^2 = - dez(\lambda^2_2 c / b)                               (3.66)

where we define

    sgn y = +1   for y > 0
            -1   for y < 0                                     (3.67)

    dez y = +1   for y > 1
             0   for -1 < y < 1
            -1   for y < -1                                    (3.68)
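
The control rule (3.66) can be checked against a direct minimisation of the Hamiltonians (3.65). The following sketch implements sgn and dez as in (3.67) and (3.68) and compares the dead-zone rule for u^2 with a brute-force search over a grid in [-1, 1]. The function names are hypothetical, and the values returned at y = 0 for sgn and at |y| = 1 for dez are assumptions of the sketch, since (3.67) and (3.68) leave them undefined.

```python
def sgn(y):
    """Signum function of (3.67). The value at y = 0 is not defined there;
    returning 0.0 is an assumption of this sketch."""
    if y > 0:
        return 1.0
    if y < 0:
        return -1.0
    return 0.0

def dez(y):
    """Dead-zone function of (3.68). The boundary values |y| = 1 are not
    defined there; mapping them to 0.0 is an assumption of this sketch."""
    if y > 1:
        return 1.0
    if y < -1:
        return -1.0
    return 0.0

def optimal_controls(lam1_2, lam2_2, b, c):
    """Control actions (3.66) for given adjoint components."""
    u1 = -sgn(lam1_2)
    u2 = -dez(lam2_2 * c / b)
    return u1, u2

def brute_force_u2(lam2_2, b, c, n=201):
    """Minimise the u^2-dependent part of H^2 in (3.65) on a grid in [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return min(grid, key=lambda u: b * abs(u) + lam2_2 * c * u)
```

For example, with b = 1 and c = 2 the adjoint value 0.2 lies inside the dead zone (u^2 = 0), while 0.8 and -0.8 lie outside it (u^2 = -1 and u^2 = +1 respectively).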


The adjoint equations for the perfect information case are
given in terms of U^{1*}, U^{2*}, the optimal strategies of the
players, as

    \dot{\lambda}^1_i = - \partial H^1/\partial x_i - c \lambda^1_2 \partial U^{2*}/\partial x_i               (3.69)

    \dot{\lambda}^2_i = - \partial H^2/\partial x_i - (sgn u^1 + \lambda^2_2) \partial U^{1*}/\partial x_i     (3.70)

for i = 1, 2. The terms \partial U^{p*}/\partial x_j for p, j = 1, 2 appearing in
(3.69) and (3.70) are absent for the null observation case.
However even in the perfect information case, since the
value of u^{1*} is +1 or -1 and that of u^{2*} is +1, -1 or 0
because of (3.66)-(3.68), the terms \partial U^{p*}/\partial x_j for p, j = 1, 2
equal zero and thus the adjoint equations will be the
same and are as follows for both the cases.

    \dot{\lambda}^p_1 = 0
    \dot{\lambda}^p_2 = - \lambda^p_1                             (3.71)

for p = 1, 2.

We shall construct the solution by integrating
backwards in time the canonical equations (3.62) and (3.71)
using (3.66) starting at the terminal surface, with the
transversality condition satisfied thereon. At corners, we
construct the switching surfaces and continue the procedure
after satisfying the corner conditions at the switching
surfaces^9. Alternatively the switching surface is assumed
as a new terminal surface and the transversality condition
satisfied and the procedure continued. The method is
similar to that followed by Isaacs (1966).

Since the terminal specification violates the
dimensionality requirement, an artifice is resorted to
by choosing a new terminal surface given by

    x_1(t_f) = \delta \cos\theta
    x_2(t_f) = \delta \sin\theta                                (3.72)

Then we apply the transversality conditions (3.24) with
\sigma = (t_f, \theta):

    [ \lambda^1_1(t_f)  \lambda^1_2(t_f) ]
        \begin{bmatrix} \delta\sin\theta & -\delta\sin\theta \\ u^1 + c u^2 & \delta\cos\theta \end{bmatrix}
        = [ -1   0 ]                                          (3.73)

    [ \lambda^2_1(t_f)  \lambda^2_2(t_f) ]
        \begin{bmatrix} \delta\sin\theta & -\delta\sin\theta \\ u^1 + c u^2 & \delta\cos\theta \end{bmatrix}
        = [ -(|u^1| + b|u^2|)   0 ]                            (3.74)

9 We name the switching surfaces obtained this way as the
Transition Surfaces in the next chapter.


Solving (3.73) and (3.74) and letting \delta \to 0, we obtain

    \lambda^p_1(t_f) = \lambda^p_2(t_f) \cot\theta ,   p = 1, 2                      (3.75)

    \lambda^2_2(t_f) = (|u^1| + b|u^2|) \lambda^1_2(t_f) = - (|u^1| + b|u^2|)/(u^1 + c u^2)    (3.76)

The Nash equilibrium terminal sequences consistent
with (3.76), (3.71) and (3.66) are (u^1, u^2) = (-1, -1) and (+1, +1).

For example, if we assume that

    u^1 = u^2 = -1                                            (3.77)

we have from (3.76)

    \lambda^2_2(t_f) = (1+b)/(1+c)                              (3.78)

and since (1+b)/(1+c) > b/c in view of c > b, we get from (3.66)

    u^1 = - sgn \lambda^1_2(t_f) = -1 ;   u^2 = - dez(\lambda^2_2(t_f) c/b) = -1     (3.79)

Equations (3.79) and (3.77) are thus consistent.

If on the other hand, we assume that

    u^1 = -1 ;   u^2 = 0                                      (3.80)

we get from (3.76)

    \lambda^2_2(t_f) = 1 > b/c                                  (3.81)

which yields

    u^1(t_f) = - sgn \lambda^1_2(t_f) = -1
    u^2(t_f) = - dez(\lambda^2_2(t_f) c/b) = -1                 (3.82)

Equations (3.80) and (3.82) are contradictory.
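
The consistency argument of (3.77)-(3.82) can be repeated mechanically for every candidate terminal pair. The sketch below assumes, as my reading of (3.73) and (3.76) in the limit of a vanishing terminal circle, that \lambda^1_2(t_f) = -1/(u^1 + c u^2) and \lambda^2_2(t_f) = -(|u^1| + b|u^2|)/(u^1 + c u^2), and keeps the pairs that reproduce themselves through (3.66). The function name and the numerical values of b and c are hypothetical.

```python
def sgn(y):
    """Signum of (3.67), with sgn(0) = 0 as an assumption."""
    return (y > 0) - (y < 0)

def dez(y):
    """Dead zone of (3.68), with the boundary |y| = 1 mapped to 0 as an assumption."""
    return (y > 1) - (y < -1)

def consistent_terminal_pairs(b, c):
    """Return the terminal pairs (u1, u2) that are consistent with (3.66) and (3.76)."""
    found = []
    for u1 in (-1, 1):
        for u2 in (-1, 0, 1):
            drift = u1 + c * u2
            if drift == 0:
                continue                      # (3.76) is undefined for zero drift
            lam1_2 = -1.0 / drift             # assumed limit of (3.73)
            lam2_2 = -(abs(u1) + b * abs(u2)) / drift   # (3.76)
            if u1 == -sgn(lam1_2) and u2 == -dez(lam2_2 * c / b):
                found.append((u1, u2))
    return found

# Hypothetical data with c > b, as required in the text.
pairs = consistent_terminal_pairs(b=1.0, c=2.0)
```

For these values only the pairs with both controls equal to -1 or both equal to +1 survive, in agreement with the terminal sequences worked out above.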


From (A.3) of Appendix A, we have the equation of
the switching curve along which the state reaches origin with
control law u^1 = u^2 = -1 as

    \gamma^-_{11} = { (s_1, s_2) : s_2 > 0 ,  s_1 = - s_2^2/(2(1+c)) }        (3.83)

At this point, we can either apply the corner
conditions at this switching surface (3.83) or consider \gamma^-_{11}
as a new terminal surface and apply the transversality
conditions. The second method is followed here and the first
method at the next corner. In essence we show that

    [ u^1 ]   [ +1  +1  -1 ]
    [ u^2 ] : [ +1   0  -1 ]

is a Nash equilibrium sequence. During the
course of the proof Figures 3.1 and 3.2 may be referred to.
Figure 3.1 shows the typical plots of \lambda^1_2, \lambda^2_2, u^1 and u^2
for this sequence and Figure 3.2 shows the switching surfaces
constructed. The state of the game at times t_2 and t_3
is denoted by (r_1, r_2) and (s_1, s_2) respectively.

Since \gamma^-_{11} is assumed as the new terminal surface,
the final state is (s_1, s_2) corresponding to the new final
time t_3. Now, from (A.6), we have

    \theta^1(s_1, s_2) = t_f - t_3 = s_2/(1+c)
    \theta^2(s_1, s_2) = (1+b)(t_f - t_3) = (1+b) s_2/(1+c)
    s_1 = - s_2^2/(2(1+c))                                    (3.84)


Applying the transversality conditions (3.24) with
\sigma = (s_2, t_3), we have

    1/(1+c) + \lambda^1_1 s_2/(1+c) - \lambda^1_2 = 0
    1 + \lambda^1_1 s_2 + \lambda^1_2 (u^1 + c u^2) = 0
    (1+b)/(1+c) + \lambda^2_1 s_2/(1+c) - \lambda^2_2 = 0
    |u^1| + b|u^2| + \lambda^2_1 s_2 + \lambda^2_2 (u^1 + c u^2) = 0         (3.85)

The quantities in (3.85) refer to time t_3-. Solving (3.85)
to be consistent with (3.71) and (3.66) yields

    \lambda^1_1(t_3-) = - 1/s_2
    \lambda^1_2(t_3-) = 0
    \lambda^2_1(t_3-) = - (2+c+b)/((2+c) s_2)
    \lambda^2_2(t_3-) = b/(2+c)                                 (3.86)

and that

    u^1(t) = 1 ;   u^2(t) = 0                                 (3.87)

for t in the interval (t_2, t_3).



Integrating (3.62) with control law (3.87) for time
(t_3 - t_2) given by (3.88)

    s_2 = r_2 + (t_3 - t_2)
    s_1 = r_1 + r_2 (t_3 - t_2) + (1/2)(t_3 - t_2)^2             (3.89)

Solving (3.88), (3.89) and (3.83) together for the equation
of the switching curve, we have

    r_2 = (c-b)(2+c) s_2 / (c(2+b+c))
    r_1 = - (\alpha/2) r_2^2                                   (3.90)

where

    \alpha = [ (2+b+c)^2 c^2 + 4(1+c)^2 (2+c)(c-b) b + 4(1+c)^3 b^2 ]
             / [ (1+c)(2+c)^2 (c-b)^2 ]                         (3.91)
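
The coefficient in (3.91) can be checked numerically by stepping back from a point on the curve (3.83) with the control law (3.87). In the sketch below the coefficient is called alpha (a name chosen here), and the duration t_3 - t_2 is an assumption: it is taken as 2b(1+c)s_2/(c(2+b+c)), the time needed for \lambda^2_2 to move between b/(2+c) and -b/c under (3.71), because equation (3.88) itself is not reproduced in this section. Function names and parameter values are hypothetical.

```python
def alpha(b, c):
    """Coefficient of the switching curve as written in (3.91)."""
    num = ((2 + b + c) ** 2 * c ** 2
           + 4 * (1 + c) ** 2 * (2 + c) * (c - b) * b
           + 4 * (1 + c) ** 3 * b ** 2)
    den = (1 + c) * (2 + c) ** 2 * (c - b) ** 2
    return num / den

def corner_state(s2, b, c):
    """Step back from (s1, s2) on the curve (3.83) to the corner (r1, r2)."""
    s1 = -s2 ** 2 / (2.0 * (1.0 + c))                       # point on (3.83)
    tau = 2.0 * b * (1.0 + c) * s2 / (c * (2.0 + b + c))    # assumed t3 - t2
    r2 = s2 - tau                                           # first line of (3.89)
    r1 = s1 - r2 * tau - 0.5 * tau ** 2                     # second line of (3.89)
    return r1, r2

# Hypothetical data with c > b.
r1, r2 = corner_state(s2=1.0, b=1.0, c=2.0)
```

Under these assumptions the corner point satisfies both relations in (3.90), which is a useful cross-check on the reconstruction of (3.91).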


From (3.86) and (3.87) we have at this corner,

    \lambda^1_1(t_2+) = - 1/s_2
    \lambda^1_2(t_2+) = - (t_3 - t_2)/s_2 = - 2b(1+c)/(c(2+b+c))
    \lambda^2_1(t_2+) = - (2+c+b)/((2+c) s_2)
    \lambda^2_2(t_2+) = - b/c                                   (3.92)

Applying the corner conditions (3.25) here, we get with
\sigma = (r_2, t_2)

    [\lambda^1_1(t_2-) - \lambda^1_1(t_2+)] \alpha r_2 - [\lambda^1_2(t_2-) - \lambda^1_2(t_2+)] = 0
    1 + \lambda^1_1(t_2-) r_2 + \lambda^1_2(t_2-)(u^1 + c u^2) = 0
    [\lambda^2_1(t_2-) - \lambda^2_1(t_2+)] \alpha r_2 - [\lambda^2_2(t_2-) - \lambda^2_2(t_2+)] = 0
    |u^1| + b|u^2| + \lambda^2_1(t_2-) r_2 + \lambda^2_2(t_2-)(u^1 + c u^2) = 0      (3.93)

Solving (3.93), (3.71) and (3.66) consistently, we have

    u^1(t) = u^2(t) = +1                                      (3.94)

for t < t_2.

Similarly the sequence

    [ u^1 ]   [ -1  -1  +1 ]
    [ u^2 ] : [ -1   0  +1 ]

can be shown
to be equilibrium optimal. The Noncooperative Optimal
Control Law is summarized below according to the demarcation
of the state-space in Figure 3.3, since the equilibrium
sequences obtained above are unique for any starting point
and the optimal paths cover the entire state space.

The switching surfaces \gamma_{11} and \Gamma are defined as

    \gamma_{11} = { (x_1, x_2) : x_1 = - x_2^2 sgn x_2 / (2(1+c)) }          (3.95)

    \Gamma = { (x_1, x_2) : x_1 = - (\alpha/2) x_2^2 sgn x_2 }               (3.96)

where \alpha is given by (3.91) and

    (x_1, x_2) \in G_1 :   (u^1, u^2) = ( 1,  0)
    (x_1, x_2) \in G_2 :   (u^1, u^2) = (-1,  0)
    (x_1, x_2) \in G_3 :   (u^1, u^2) = ( 1,  1)
    (x_1, x_2) \in G_4 :   (u^1, u^2) = (-1, -1)                (3.97)

It should be noted that for the null information
case, the optimal strategies are only functions of time
and the synthesis problem is solved for the perfect
information case where the strategies are closed-loop
control laws. The above solution can indeed be shown to
be valid for the perfect information case by verifying the
Hamilton-Jacobi equation (3.42).

The Noncooperative Value Functions (W^1, W^2) are
calculated using the general results (B.9) of Appendix B
and tabulated in Table 3.1.

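The Hamilton-Jacobi equation (3.42) is verified
below for the Region G_1.

For player 1, we have

    0 = \min_{|u^1| \le 1} { 1 - \sqrt{(2+c)/(1+c)} x_2/\sqrt{x_2^2 - 2x_1}
          - ( 1 - \sqrt{(2+c)/(1+c)} x_2/\sqrt{x_2^2 - 2x_1} ) (u^1 + c u^2) }       (3.98)

Now since

    \sqrt{2+c} x_2 = \sqrt{(1+c) x_2^2 + x_2^2}
                   < \sqrt{(1+c) x_2^2 - 2 x_1 (1+c)}
                   = \sqrt{1+c} \sqrt{x_2^2 - 2x_1}              (3.99)

it follows that

    u^1 = - sgn( \sqrt{(2+c)/(1+c)} x_2/\sqrt{x_2^2 - 2x_1} - 1 ) = +1      (3.100)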



Table 3.1. Value Function (W^1, W^2) of Noncooperative Solution

Similarly for player 2, we have

    0 = \min_{|u^2| \le 1} { |u^1| + b|u^2| - (2+c+b) x_2/\sqrt{(2+c)(1+c)(x_2^2 - 2x_1)}
          + [ (2+c+b) x_2/\sqrt{(2+c)(1+c)(x_2^2 - 2x_1)} - 1 ] (u^1 + c u^2) }      (3.101)

Since

    (2+c+b) x_2/\sqrt{(2+c)(1+c)(x_2^2 - 2x_1)} - 1
        < (2+c+b) \sqrt{(1+c)(x_2^2 - 2x_1)} / [ (2+c) \sqrt{(1+c)(x_2^2 - 2x_1)} ] - 1
        = b/(c+2) < b/c                                       (3.102)

it follows that

    u^2 = - dez{ (c/b) [ (2+c+b) x_2/\sqrt{(2+c)(1+c)(x_2^2 - 2x_1)} - 1 ] } = 0     (3.103)
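
The verification in (3.98)-(3.103) can also be done numerically at a sample point. The sketch below assumes closed forms for W^1 and W^2 in the region where (u^1, u^2) = (1, 0), read off from the gradients that appear in (3.98) and (3.101); the entries of Table 3.1 are not available here, so these expressions, the function names, the sample point and the parameter values are all assumptions. It evaluates the Hamiltonians with finite-difference gradients and minimises them on a grid.

```python
import math

def value_functions(x1, x2, b, c):
    """Assumed closed forms of W^1 and W^2 where (u1, u2) = (1, 0)."""
    root = math.sqrt(x2 * x2 - 2.0 * x1)
    w1 = math.sqrt((2.0 + c) / (1.0 + c)) * root - x2
    w2 = (2.0 + c + b) * root / math.sqrt((2.0 + c) * (1.0 + c)) - x2
    return w1, w2

def gradients(x1, x2, b, c, h=1e-6):
    """Central-difference gradients (dW/dx1, dW/dx2) of W^1 and W^2."""
    w_px1 = value_functions(x1 + h, x2, b, c)
    w_mx1 = value_functions(x1 - h, x2, b, c)
    w_px2 = value_functions(x1, x2 + h, b, c)
    w_mx2 = value_functions(x1, x2 - h, b, c)
    g1 = ((w_px1[0] - w_mx1[0]) / (2 * h), (w_px2[0] - w_mx2[0]) / (2 * h))
    g2 = ((w_px1[1] - w_mx1[1]) / (2 * h), (w_px2[1] - w_mx2[1]) / (2 * h))
    return g1, g2

def hj_minima(x1, x2, b, c, n=201):
    """Minimise H^1 over u1 (with u2 = 0) and H^2 over u2 (with u1 = 1)."""
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    g1, g2 = gradients(x1, x2, b, c)

    def h1(u1):
        return 1.0 + g1[0] * x2 + g1[1] * (u1 + c * 0.0)

    def h2(u2):
        return 1.0 + b * abs(u2) + g2[0] * x2 + g2[1] * (1.0 + c * u2)

    u1_star = min(grid, key=h1)
    u2_star = min(grid, key=h2)
    return (u1_star, h1(u1_star)), (u2_star, h2(u2_star))

# Hypothetical sample point: 0.3 time units before reaching s = (-1/6, 1)
# on the curve (3.83), with b = 1 and c = 2.
(p1_u, p1_h), (p2_u, p2_h) = hj_minima(-0.4216666666666667, 0.7, b=1.0, c=2.0)
```

At this point both minimised Hamiltonians vanish to within the finite-difference error, and the minimisers agree with (3.100) and (3.103).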


Similarly it has been verified for the other
regions. This example is solved in detail so that in
future, we can relegate the unnecessary details to the
appendices.


3.5 CONCLUSIONS 

In this chapter, necessary and some sufficient
conditions are given for a class of deterministic N-person
differential games. The class of games is assumed not to
exhibit singular and abnormal solutions. The application
of these conditions is shown through solving two examples.
The second example has a much deeper significance and for
the complete solution of this example, we should relax the
above restriction, which is the main theme of the next
chapter.

The only types of information patterns assumed
for the players are perfect and no observations. Though
we continue with this assumption, we believe that the
observability concept plays an important role in the case
of partial observations to the players, which means that
their observation vectors are of dimension less than that
of the state of the game.

We will have occasion to come across more than
one equilibrium point in the next chapter. The noncooperative
solution is to be defined in this case through the
extra concepts given in Chapter II.



CHAPTER IV

SWITCHING SURFACES IN DIFFERENTIAL GAMES
4.1 INTRODUCTION

In the preceding chapter we considered a class
of noncooperative differential games. In these the
players were permitted to use discontinuous strategies
and the discontinuities in their optimal strategies were
assumed to lie on certain surfaces in the playing space
\mathcal{R} of the game. Further we imposed the condition that
the game does not exhibit Singular and Abnormal solutions.
The present chapter is devoted to a study of the surfaces
containing such solutions. These, as well as the
surfaces containing the discontinuities in the optimal
strategies of the players, will be referred to as
switching surfaces hereafter.

Since the strategy of any player with perfect
information is a feedback control law, the solution of a
game requires (loosely speaking) the compulsory solution
of the synthesis problem for all the players with
perfect information, which is only optional in the case
of optimal control problems. Perhaps one could also
solve the synthesis problem for other players^1 for
convenience in representing their control laws and the
optimal paths in the playing space of the game. The
construction of switching surfaces holds in either case
with proper interpretation.

1 The misleading term 'optimal open-loop feedback control'
is used by some authors for this solution for players
with no observations.

The solution of the game between the switching
surfaces is obtained in a routine fashion by integrating
the canonical equations obtained by the minimum principle
stated in Chapter III. This is referred to as the
'solution in the small' (Isaacs 1965) in contrast to the
'solution in the large' consisting of the construction of
the switching surfaces. This latter aspect is on an
uneasy terrain in the literature and is mostly
example-oriented (Breakwell 1969).

A general classification of the switching
surfaces in optimal control and differential games is
presented in Section 4.2. Next we deal with the
conditions to be satisfied on them for their construction,
along with simple examples. We present in Section 4.4 the
complete noncooperative solution of the double-integral
plant introduced in Chapter III.

4.2 CLASSIFICATION OF SWITCHING SURFACES 

An exhaustive classification of switching surfaces
can be made considering the nature of optimal paths on
the surface and its immediate neighbourhood, i.e., whether
the optimal paths enter, leave or are parallel to the
surface (not necessarily of the same nature on either
side). Isaacs resorted to this classification and
showed that some of the surfaces under this classification
are unrealizable. His Universal, Dispersal and Transition
Surfaces derive their names on this basis. Any switching
surface can be labelled as belonging to the players whose
strategies are discontinuous across the surface.

A different classification is based on the method
of construction or the condition to be satisfied on the
switching surface. Thus Transition Surfaces are constructed
by the application of corner conditions (3.25). The other
candidates in this classification are the Singular,
Dispersal and Abnormal Surfaces.

Singular Surfaces^2 are surfaces containing singular
optimal paths. A definition of Robbins (1967) is generalized
here. An Extremal Arc is Singular, if at each point
of the arc there is some allowable first-order weak control
variation for at least one player p, which leaves his
corresponding Hamiltonian H^p unchanged to second order.
This condition reduces to H^p_{u^p u^p} being singular, if
the control vector u^p is interior to its restraint set
\Omega^p. In other words, on a singular arc the Legendre-Clebsch
condition is satisfied in its weak form only. As in optimal
control problems, the most common examples arise when the
Hamiltonian of a player is linear or sectionally linear in
some of his control variables. Most of the Universal
Surfaces of Isaacs (1965) fall in this category. The
occurrence of these arcs in relaxed variational problems is
well-known (Warga 1962).

2 Isaacs (1965) uses the term Singular Surfaces in place
of switching surfaces used in this thesis. We use the
term Singular Surfaces to represent surfaces containing
singular solutions.

The Dispersal Surface of a player consists of
starting points in \mathcal{R} corresponding to which the player has
multiple equilibrium sequences yielding the same Value to
him. These are thus identical with the surfaces of the
Regular Decomposition in Section 3.2. The Value Function of
any other player whose strategy is continuous across this
surface may be discontinuous because of the nonequivalence
of the different equilibrium strategies. A thorough
presentation of these surfaces for the two-person zero-sum
case is given by Isaacs (1965).


Abnormal Surfaces are surfaces containing abnormal
solutions. On the abnormal paths, the minimum principle in
Chapter III is satisfied with \lambda^p_0 = 0. Thus \lambda^p_0 = 0 in
the expression (3.20) for the Hamiltonian H^p and the
Transversality and Corner Conditions (3.22) - (3.25). It
is well-known in optimal control that there may be no
neighbouring curves with admissible controls in the case
of abnormal solutions. The Abnormal Surfaces in problems
with time as payoff have a special significance as is
evidenced by the concept of Barrier in two-person zero-sum
games studied by Isaacs (1965).

4.3 CONSTRUCTION OF THE SWITCHING SURFACES 

The construction of switching surfaces in optimal
control problems appears in contemporary literature. The
familiar problems are the ones with the Hamiltonian linear
or sectionally linear in the control variables which are
bounded. The resulting bang-bang, three-level etc.
controls are given in terms of signum, dead-zone etc.
functions of a suitable Switching Function with the state
and adjoint variables x, \lambda as its arguments. Under the
usual smoothness assumptions on the formulation functions
f and L as given in Section 3.2,
the Switching Function as well as \lambda are continuous
functions of their respective arguments. In these problems
the construction of switching surfaces is an easy matter.

On the other hand in Differential Games, the
Switching Functions as well as the adjoint variables
need not be continuous in spite of similar smoothness
assumptions on f and L (see Example 3.2). This
situation arises because the discontinuities in the optimal
strategies of the other players reflect in the dynamic
equations of the one-sided optimal control problem of the
remaining player p. The preceding result is true whatever
be the information patterns to the players. The corner
conditions stated by Berkovitz (1961) for such a problem
are equivalent to the results in Section 3.2. In the same
vein, it is shown in (Sarma et al. 1969) that if all other
players have continuous strategies at the switching surface
of any player p, then his adjoint variables \lambda^p are
continuous across this surface. It is for this reason
that in Example 3.2, the adjoint variables for player 2 are
continuous at the corner represented by time t_2 (see
Figure 3.2).


The possible discontinuities in the adjoint
variables together with the simultaneity involved in
obtaining the strategies of all the players make the
construction of switching surfaces more difficult in
differential games. This explains to a large extent the
Bang-Bang-Bang surfaces, jocularly named so by Isaacs (1969).
With these remarks, we indicate the construction of the
specific surfaces below.

Singular Surfaces:

Singular extremals in optimal control problems have been
studied in the literature (for example Johnson 1965,
Kelley et al. 1966 and Robbins 1967) and their study is
linked with the Hilbert Differentiability Condition of
the classical variational calculus. Of these the
Robbins' version of the second variation method is
presented here so as to be applicable to linear singular
arcs in Differential Games. Linear singular arcs arise
because of the Hamiltonian H^p of a player being linear
in some components of u^p and when u^p is interior to
its restraint set. In the remaining cases, either the
situation is too transparent or it can be dealt with on
similar lines as in optimal control problems.


The linear singular control variables cannot be
determined from (3.37) of the minimum principle. However,
differentiation of H^p_{u^p} a sufficient number of times equal
to the order of singularity will determine the singular
control variables after suitable manipulation. The
manipulation consists in substituting the canonical equations
and the expressions for the nonsingular control variables
after each differentiation. The generalized Legendre-Clebsch
condition is stated as a test for the optimality of singular
extremals.


Generalized Legendre-Clebsch Condition: For the player
p, \Phi^p_k is formed successively for different values of
k, where

    \Phi^p_k = \partial/\partial u^p [ d^k/dt^k H^p_{u^p} ]                  (4.1)

The first time \Phi^p_k is not equal to a null matrix, k should
be even, i.e., k = 2q. The matrix (-1)^q \Phi^p_{2q} must be
positive semidefinite. If this condition fails, the singular
extremal is not optimal. The two-person zero-sum version of
this condition is given by Anderson (1969) along with a few
junction conditions. These junction conditions are however
applicable only when the remaining player uses continuous
strategy across the junction.


If the different control variables have different 
orders of singularity, the test is applied successively and 
the variables are reduced in their order of singularity. If 
the control variables of several players are singular, the 
test should be applied simultaneously to all the corresponding 
Hamiltonians. Thus the actual construction of Singular 
Surfaces is accompanied with several analytical difficulties. 
We present here a simple example. 


Example 4.1: We shall examine the possibility of singular
solutions in Example 3.2. The game satisfies the equations

    \dot{x}_1 = x_2
    \dot{x}_2 = u^1 + c u^2                                    (4.2)

referred to as the double integral plant with the two
inputs constrained as follows.

    |u^1| \le 1 ,  |u^2| \le 1                                 (4.3)

The payoff functionals of the players are given by

    J^1[x_0, \underline{u}] = \int_0^{t_f} dt
    J^2[x_0, \underline{u}] = \int_0^{t_f} ( |u^1| + b|u^2| ) dt       (4.4)

where x_0 is the initial state and the terminal state is
specified as the origin. Equations (4.2) - (4.4) are
identical with (3.62) - (3.64).

The application of the necessary conditions is shown
in Section 3.4. The Hamiltonians, adjoint equations and the
optimal control actions for this problem are given by (3.65),
(3.71) and (3.66) respectively, and are reproduced below for
ready reference.

    H^1(x, \underline{u}, \lambda^1) = 1 + \lambda^1_1 x_2 + \lambda^1_2 (u^1 + c u^2)
    H^2(x, \underline{u}, \lambda^2) = |u^1| + b|u^2| + \lambda^2_1 x_2 + \lambda^2_2 (u^1 + c u^2)      (4.5)

    \dot{\lambda}^p_1 = 0
    \dot{\lambda}^p_2 = - \lambda^p_1                             (4.6)

for p = 1, 2 and

    u^{1*} = - sgn \lambda^1_2
    u^{2*} = - dez(\lambda^2_2 c/b)                              (4.7)

The signum and deadzone functions in (4.7) are defined
in (3.67) and (3.68).

A singular u^1 requires that \lambda^1_2(t) = 0 and
\dot{\lambda}^1_2(t) = - \lambda^1_1(t) = 0, which violates the condition H^1 = 0
on the trajectory and hence does not arise. On the other
hand, a singular u^2 requires

    \lambda^2_2(t) c/b = \pm 1    or    \lambda^2_2(t) = \pm b/c             (4.8)

and

    \dot{\lambda}^2_2(t) = - \lambda^2_1(t) = 0                    (4.9)

Thus for the terminal sequence

    [ u^1 ]   [ -1 ]
    [ u^2 ] = [ -\epsilon ]

with 0 \le \epsilon \le 1 to be optimal, we should have from the
transversality condition (3.76) and (4.8) (considering the
upper values) and (4.9)

    \lambda^1_1(t) = \lambda^2_1(t) = 0                           (4.10)

    \lambda^2_2(t) = (1 + b\epsilon) \lambda^1_2(t) = (1 + b\epsilon)/(1 + c\epsilon) = b/c      (4.11)

Equation (4.11) requires that b = c and \epsilon be a
constant on the trajectory. It can be seen that this
sequence does not violate the Generalized Legendre-Clebsch
Condition. Considering the lower values of (4.8) a similar
result can be shown for the terminal sequence (u^1, u^2) = (+1, +\epsilon). The
resulting trajectories for 0 \le \epsilon \le 1 are shown in regions
G_5 and G_6 of Figure 4.1. The equation of the trajectory
along which the state reaches origin with the control law
(u^1, u^2) = \pm(1, \epsilon) with 0 < \epsilon < 1 is given by (see Appendix A)

    \gamma_{1\epsilon} = { (x_1, x_2) : x_1 = - x_2^2 sgn x_2 / (2(1 + c\epsilon)) }       (4.12)

Now we can determine whether these trajectories
are optimal for the perfect information case by verifying
the Hamilton-Jacobi equation. Expressing the control laws
as feedback policies in the Region G_5, we have in view
of (4.12) and Appendix A

    U^1 = - 1
    U^2 = (x_2^2 + 2x_1)/(2 x_1 c)                             (4.13)

    W^1(x_1, x_2) = x_2/(1 + c\epsilon)
    W^2(x_1, x_2) = (1 + b\epsilon) x_2/(1 + c\epsilon)             (4.14)
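
The feedback form (4.13) follows from solving (4.12) for \epsilon at a given state. The sketch below, for the branch x_2 > 0, recovers \epsilon from the state, evaluates the feedback controls of (4.13) and the time to reach the origin (the first expression in (4.14)), and checks them on a point generated by running the plant (4.2) backwards from the origin. The function names and numerical values are hypothetical.

```python
def epsilon_on_arc(x1, x2, c):
    """Solve (4.12) for epsilon on the branch x2 > 0 (so x1 < 0)."""
    return -(x2 * x2 + 2.0 * x1) / (2.0 * x1 * c)

def feedback_controls(x1, x2, c):
    """Feedback policies of (4.13): U^1 = -1, U^2 = -epsilon."""
    return -1.0, (x2 * x2 + 2.0 * x1) / (2.0 * x1 * c)

def time_to_origin(x1, x2, c):
    """W^1 of (4.14): time needed to reach the origin along the arc."""
    return x2 / (1.0 + c * epsilon_on_arc(x1, x2, c))

def state_on_arc(t_go, eps, c):
    """State from which the plant (4.2) reaches the origin in time t_go
    under the constant controls u1 = -1, u2 = -eps."""
    rate = 1.0 + c * eps              # magnitude of dx2/dt along the arc
    x2 = rate * t_go
    x1 = -0.5 * rate * t_go ** 2
    return x1, x2

# Hypothetical data: c = 2, epsilon = 0.5, 1.5 time units from the origin.
x1, x2 = state_on_arc(t_go=1.5, eps=0.5, c=2.0)
```

Because \epsilon is constant along each arc, the feedback value of U^2 computed from the state stays equal to -\epsilon all the way to the origin.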


Now for player 1, we have the Hamilton-Jacobi equation

    0 = \min_{|u^1| \le 1} { 1 - (2/x_2) x_2 + (2x_1/x_2^2)(u^1 + c U^2) }

or

    u^1 = - sgn(2x_1/x_2^2) = +1                               (4.15)

Since (4.15) contradicts (4.13), the cluster of singular
solutions in G_5 (also G_6 similarly) are not optimal
for player 1.

However by writing the Hamilton-Jacobi equation
for player 2, it can be easily seen that the cluster of
singular solutions in G_5 and G_6 are optimal for this
player. This result finds application in the next section.

Dispersal Surfaces:

In optimal control problems, with the usual
smoothness assumptions on f and L and the terminal
surface, Dispersal Surfaces are not usually met with. Since
for any player having multiple strategies at a Dispersal
Surface the Values corresponding to all these are equal,
these surfaces are constructed based on this property
(Isaacs 1965). However on the Dispersal Surface \Delta_{i_1 i_2 ... i_k}
of player p the following condition holds. For any
dx, dt variations on the manifold (Berkovitz 1964), we have

    H^p(x, \lambda^p_{i_1}, U^*_{i_1}, t) - \lambda^p_{i_1} dx = H^p(x, \lambda^p_{i_2}, U^*_{i_2}, t) - \lambda^p_{i_2} dx
        = ... = H^p(x, \lambda^p_{i_k}, U^*_{i_k}, t) - \lambda^p_{i_k} dx             (4.16)

Example 4.2: We construct the Dispersal Surface for the
double integral plant (4.2) and (4.3) on a restricted
playing space. The playing space \mathcal{R} and the terminal
surface T_1 \cup T_2 are shown in Figure 4.2. T_1 and T_2 are
given by

    T_1 = { (x_1, x_2) : x_2 > 0 ,  x_1 = - x_2^2/(2(1+c)) }
    T_2 = { (x_1, x_2) : x_2 > 0 ,  x_1 = - x_2^2/2 }            (4.17)

Thus T_1 and T_2 are identical with \gamma^-_{11} and \gamma^-_{10}
respectively (see Appendix A).

The payoff functionals are of the composite type
given by

    J^1[x_0, \underline{u}] = \theta^1(x_f) + \int_0^{t_f} dt
    J^2[x_0, \underline{u}] = \theta^2(x_f) + \int_0^{t_f} [ |u^1| + b|u^2| ] dt      (4.18)

where x_f is the terminal state at the free terminal
time t_f. The functions \theta^1 and \theta^2 are given below.

    \theta^1(x_f) = x_{2f}/(1+c)          for x_f \in T_1
                    x_{2f}                for x_f \in T_2         (4.19)

    \theta^2(x_f) = x_{2f}(1+b)/(1+c)     for x_f \in T_1
                    x_{2f}                for x_f \in T_2         (4.20)

The Hamiltonians for the players, the adjoint
equations and the optimal control actions are given again
as in (4.5), (4.6) and (4.7) respectively. By the
application of transversality conditions on T_1 the
optimal control sequence for paths that end on T_1 is
given by

    [ u^1 ]   [ 1 ]
    [ u^2 ] = [ 0 ]

as shown in (3.84) - (3.87). By a similar
application on T_2, we have

    1 - \lambda^1_1 (- x_{2f}) - \lambda^1_2 = 0
    1 + \lambda^1_1 x_2 + \lambda^1_2 (u^1 + c u^2) = 0
    1 - \lambda^2_1 (- x_{2f}) - \lambda^2_2 = 0
    |u^1| + b|u^2| + \lambda^2_1 x_2 + \lambda^2_2 (u^1 + c u^2) = 0        (4.21)

where all the quantities correspond to time t_f. Solving
(4.21), (4.6) and (4.7) consistently under the imposed
condition c > 2, we have

    \lambda^1_1(t_f) = - 1/x_2
    \lambda^1_2(t_f) = 0
    \lambda^2_1(t_f) = [ - 1 - b - b(1-c)/(c-2) ] / x_2
    \lambda^2_2(t_f) = b/(c-2)                                  (4.22)

    u^1(t) = 1 ;   u^2(t) = -1    for t < t_f                  (4.23)

Thus when c > 2, the optimal sequence reaching T_2 is
(u^1, u^2) = (1, -1) and when c < 2 there are no paths reaching T_2.

Thus when c > 2, while the strategy of player 1
is continuous in \mathcal{R}, the second player's strategy is
discontinuous and hence has a switching surface which in this
case is a Dispersal Surface. For starting points on this
surface W^2 must be same whether the optimal paths
reach T_1 or T_2. Thus making use of (B.9) of Appendix B,
we have for any (x_1, x_2) on the Dispersal Surface

    - x_2 (1+b)/(1-c) + [ 1 + (1+b)/(1-c) ] \sqrt{ (x_2^2 - 2x_1(1-c)) / (1 + (1-c)) }
        = - x_2 + [ (2+c+b)/\sqrt{(1+c)(2+c)} ] \sqrt{ x_2^2 - 2x_1 }           (4.24)

Thus the Dispersal Surface is given by

    \Delta = { (x_1, x_2) : x_1 = - (\eta/2) x_2^2 }              (4.25)

where \eta is given by

    (b+1)/(c-1) - [ (2+b-c)/(c-1) ] \sqrt{ (-1 + (c-1)\eta)/(c-2) }
        = - 1 + [ (2+c+b)/\sqrt{(1+c)(2+c)} ] \sqrt{ 1 + \eta }                 (4.26)
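
Equation (4.26) has no closed-form solution in general, but the coefficient of the Dispersal Surface can be found numerically. The sketch below codes the two sides of (4.24) as the Value of player 2 for paths ending on the two terminal curves and bisects on the coefficient (called eta here, a name chosen for the sketch) with x_2 = 1, which loses nothing because both sides scale linearly with x_2 along x_1 proportional to x_2^2. The bracket [1/(c-1), 1], the function names and the parameter values b = 2.9, c = 3 are assumptions; a sign change on the bracket is not guaranteed for every b and c, in which case the function raises an error.

```python
import math

def w2_via_T1(x1, x2, b, c):
    """Right-hand side of (4.24): Value of player 2 for paths ending on T_1."""
    return -x2 + (2.0 + c + b) * math.sqrt(
        (x2 * x2 - 2.0 * x1) / ((1.0 + c) * (2.0 + c)))

def w2_via_T2(x1, x2, b, c):
    """Left-hand side of (4.24): Value of player 2 for paths ending on T_2."""
    inside = (x2 * x2 - 2.0 * x1 * (1.0 - c)) / (2.0 - c)
    s2 = math.sqrt(max(inside, 0.0))          # guard against rounding below zero
    return -x2 * (1.0 + b) / (1.0 - c) + (1.0 + (1.0 + b) / (1.0 - c)) * s2

def dispersal_eta(b, c, tol=1e-12):
    """Bisection for eta in (4.26) on the assumed bracket [1/(c-1), 1]."""
    def gap(eta):
        x1 = -0.5 * eta                        # point of (4.25) with x2 = 1
        return w2_via_T2(x1, 1.0, b, c) - w2_via_T1(x1, 1.0, b, c)

    lo, hi = 1.0 / (c - 1.0), 1.0
    if gap(lo) * gap(hi) > 0:
        raise ValueError("no sign change of (4.26) on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical data with c > 2 and c > b.
eta = dispersal_eta(b=2.9, c=3.0)
```

For these values the root lies strictly between the coefficients of the two terminal curves, so the resulting surface sits inside the region between them.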


One can easily see that W^1 is not the same for
both the optimal paths. Hence the two equilibrium points
are nonequivalent for player 1. There is no counterpart of
this result in two-person zero-sum games for obvious reasons.

A similar construction can be made in the space
between \gamma^+_{10} and \gamma^+_{11} as shown by the broken lines in
Figure 4.2.

Abnormal Surfaces:

Abnormal solutions have not been extensively
studied even in optimal control problems. Often the
necessary and sufficient conditions for normality in the
calculus of variations are difficult to translate into
optimal control theory. Abnormal solutions are usually constructed from
their definition. The Semipermeable Surfaces of Isaacs (1965)
follow this construction and are examples of Abnormal Surfaces.
The optimality of these solutions can be established easily
only for the optimal control problems. In the case of
two-person zero-sum games with time as payoff, these surfaces
have a special significance associated with them, viz., the
Barrier Concept, and the roles of the players in determining
these surfaces are fixed. We shall show, in terms of an
example, the construction of Abnormal Solutions.

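Example 4.3 : We construct the Abnormal solutions for the
double integral plant problem given in (4.2) - (4.4). Since
$\lambda_0^1 = \lambda_0^2 = 0$ in this case, (4.5) - (4.7) get
modified as follows :

\[ H^1 = \lambda_1^1 x_2 + \lambda_2^1 (u^1 + c u^2), \qquad
   H^2 = \lambda_1^2 x_2 + \lambda_2^2 (u^1 + c u^2) \tag{4.27} \]

\[ \dot{\lambda}_1^p = 0, \qquad \dot{\lambda}_2^p = -\lambda_1^p \tag{4.28} \]

for p = 1, 2 and

\[ u^{1*} = -\operatorname{sgn} \lambda_2^1, \qquad
   u^{2*} = -\operatorname{sgn} \lambda_2^2 \tag{4.29} \]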
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The curves "( 


are the 


11 ’ ^11 * ^ 1,-1 ’ ^- 1,1 

abnormal curves with the corresponding control sequences as 

t + 1 }T M >| 1 1 and V l . We show this below for N f~ 

+lj l-lj 1-lJ L+li 11 

by showing that it satisfies the required necessary conditions.
The others follow similarly.

From the transversality conditions (3.24) we have
$\lambda_1^1$ and $\lambda_1^2$ arbitrary and

\[ \lambda_2^1(t_f) = \lambda_2^2(t_f) = 0 \tag{4.30} \]

By (A.6) of Appendix A, we have for any initial state
$(x_1, x_2)$ on this curve

\[ t_f = -\frac{x_2}{1+c} \tag{4.31} \]

From (4.28), (4.30) and (4.31), on integration, we have

\[ \lambda_2^p(0) = -\lambda_1^p \, \frac{x_2}{1+c} \tag{4.32} \]

Equations (4.32) and (4.29) yield

\[ u^{p*} = +1 \tag{4.33} \]

with $\lambda_1^p$ assumed any negative number for p = 1, 2.


The solutions f U “ 4 will be shown to be 

optimal in the next section under certain conditions. This
is because both the players are primarily interested in
terminating the game. The remaining two curves are probably
optimal if we change the role of either player to maximize
his payoff functional instead of the minimization that is
assumed.

3 Equations (3.73) and (3.74) get modified with null vectors
on the right-hand side in this case.

There are other switching surfaces in the
literature which are constructed with the ideas presented
here. For example, the Equivocal Surface (Isaacs 1965) is
an Abnormal Surface which is also a Dispersal Surface
corresponding to one player and a Singular Surface
corresponding to the other. It is a member of a class of
parametrized Abnormal Surfaces and is determined by the
conditions corresponding to its being a Dispersal and a
Singular Surface. As we stated earlier, the construction
of these switching surfaces is mainly example-oriented.

In the next section we make use of all the examples in this
section to obtain the complete solution of the double
integral plant problem.

4.4 NONCOOPERATIVE SOLUTION OF THE DOUBLE INTEGRAL PLANT

Here we present the complete solution of the problem
formulated as Example 3.2. Its solution for the case c > b
is already presented there. For this case, there is a
unique Nash equilibrium sequence for every starting point
and this is defined as the noncooperative solution. We
showed that the solution is essentially the same for the
cases when the players have no observations as well as when
they have perfect knowledge of the state of the game.

The remaining cases $c \le b$ are solved in this
section. We shall presently see that there is nonuniqueness of
equilibrium sequences for certain starting points and the
noncooperative solution is to be defined by a suitable
selection of these sequences.


Case (1) c < b and $2c - b^2 + c^2 + 2bc > 0$ :

A procedure similar to that in Example 3.2 yields 

[ *+1 +1 TlT 

+i "~o o 1 are * n 

Nash ’equilibrium under the condition 


C 


l+« 


(4.34) 


where 


o< T _ b - c - 2bo 


(4.35) 


(b- c r / 

Condition (4.34) is equivalent to the assumption
$2c - b^2 + c^2 + 2bc > 0$. This control law can be stated as
follows in accordance with Figure 4.3(ii).

x | 

^10 = x 2 5 : x ! = - — s S n *2 j 

p r = ^ (x 1? X 2 ) : x 1 - - 


(4.36) 


' r x 2 
x 2 


sgn Xg 


1 


2 


(4.37) 
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(x^Xg) e 



(U 1 ,!! 2 )- = ( 1, 0) 


(x^Xg) 6 G 3 

u^) e 

(x i’V e 040/^ 


(u^jU 2 ) = (-1, 0) 

(u 1 ^ 2 ) = ( 1, l) 

(u^u 2 ) = (-1,-1) 


(4.41) 


Considering as the terminal surface, -we can 

construct the equilibrium sequences f ~ 1 * 1 1 under the 

— ■ +1 0 

assumption c > 2. No such sequences however exist when
$c \le 2$. This was shown in Example 4.2 where $\gamma_{10}$ can be
identified with 3 * 2 * In bhe same section we also constructed 
a Dispersal Surface A for the second player when c > 2. 

All this is shown in Figure 4.3(ii). Thus we have 


(x^Xg) e G7 

u v x 2 ) e gq 

(X^Xg) e Gg 

(x^xg) e 


(u\u 8 ) = ( 1, 0) 

(u’-.u 8 ) = (-1, 0) 

( 4*42) 

(uV) = ( 1,-1) 

(u x ,u 2 ) = (-1, l) 


Now we have to define the non cooperative solution 
taking into account the various equilibrium sequences 

r±i ±1 +1] r+i ±1 +11 r+i +11 

1 , ■ _ ] > \ \ and \ and the decom- 

L+i 0 +ij L+i 0 oi L + 1 °J 

position associated with them as shown in Figvjre 4.3, We 

can easily see that the control sequences corresponding to
Figure 4.3(i) are preferred by player 1 and those corresponding
to Figure 4.3(ii) are preferred by player 2. To illustrate,
for any starting point (x^jXg) on > we can compare the 

Values to both the players corresponding to both the cases of
Figure 4.3. The Values can be calculated from (A.6) and (B.9)
of Appendices A and B. Thus for Player 1 we have to verify


that 



(4.43) 


or that 

J , ?i|t £2 - l < l i.e. 4+ 2 c < 4+4c (4.44) 

which is true since c> 0 . Similarly for Player 2 , we have 
to verify that 


- * 2 + r 2 Xg [ 

1 +c 


S? [1 + 13 > *3 


^ >1 which is true since c > 0 . 

3+c^ 2 


or that 

preferences of the players follow similarly 


(4.45) 
The stated 


No Observations to the Players : We can see from Figure 4.3
that the players have full agreement over the Regions G 7 and
G q and this constitutes the noncooperative solution on these
regions. As remarked earlier, these regions extend completely
between and Y when c ^ 2 . 

For other regions, the strategies are not interchangeable
and hence present a coordination problem. The
game cannot be reduced using the Payoff Dominance Concept,
since of the two equilibrium strategies which are
noninterchangeable, one is preferred by one player and the
other by the remaining player. Risk Dominance also does not
apply since the recombined strategies are not playable (see
Footnote 4). This Deadlock cannot thus be resolved as in
finite games. It may be possible to use mixed strategies to
define the Tacit and Vocal solutions in this case. This
problem is suggested for future research.

Perfect Information to the Players : Since the players
have a running knowledge of the state, it is helpful to
them in terminating the game in which both are interested.
The problem cited above does not arise here since the
recombined strategies are playable. Thus we can calculate
the Risk to the players when each of them sticks to his
preferred strategy, i.e., player 1 plays the strategy
indicated by Figure 4.3(i) and player 2 that indicated by
Figure 4.3(ii).

As an example, for any starting point $(x_1, x_2)$ on
$\gamma_{10}$, it is easy to see that the state of the game follows a
chattering path to the origin along $\gamma_{10}$, shown exaggerated
by broken lines in Figure 4.3(ii). The total time taken for

4 This shows that they are derived from two different
Normal Forms of the game.



same as the cost to p layer 1 -corresponding to 


this is x 2 » 

the equilibrium strategies preferred by player 2, shown in 
Figure 4.3(ii). 


Of the total time, for the fraction 


c-2 


the 


players use Cl and for the remaining time they play 

[li] sinoe 


3=2.1 + f (1-0) = -1 


(4.46) 


The cost to Player 2 corresponding to this chattering 
path is 


[¥ •' 1 + f U+b) ] = (1+ f> "2 (4 ‘ 47> 

We also calculate from (B.9) of Appendix B the cost to 
Player 2 corresponding to the equilibrium strategies 
represented by Figure 4.3(i). It is given by 


~ x 2 + 


2+c+b 


= x. 


>|(2+c ) ( 1+c ) 
(2+c+bX/"2 


I 


Xo + 2x-> 


vj ( 2+ c ) ( 1+ c ) 


- 1 


(4.48) 


On simplification, (4.48) is less than (4.47).
Hence while there is no risk involved for player 1, there is
considerable risk for player 2 in adhering to his preferred
strategy if the opponent also does the same. He derives a
Value inferior even to that of the equilibrium point
preferred by his opponent.

Hence the noncooperative solution for this case is
as represented by Figure 4.3(i). This solution consists of
both normal and abnormal arcs.


r+i +1 
to o 


1 


Case (2) c < b and $2c - b^2 + c^2 + 2bc < 0$ :

In this case the equilibrium sequences are 
as shown in Appendix C and the switching curve p T goes into 
the space between Y^ and Y^q (see Figure 4.4). The 
abnormal solution Y^ no longer satisfies the Envelope 
Principle of Isaacs (1965) and hence does not correspond to 
Nash equilibrium. In the region between p l and there 

are no equilibrium solutions reaching p’. However for the 
case c > 2, we have ul as equilibrium sequences 
corresponding to solutions reaching $\gamma_{10}$ as shown in
Example 4.2.


Perfect Observations to the Players : An interesting
strategy available to player 2 to reach $p'$ when the game
is in between $\gamma$ and $p'$ is to use $u^2 = 0$. Since
player 2 thus leaves the optimization problem entirely to
his opponent, we call this his Abstaining Strategy. Player 1
through his observations can detect this and the best
strategy available to him under this condition is to play
$u^1 = \pm 1$ depending upon whether the state of the game is
below or above $\gamma_{10}$. Now when c > 2 one can fit a
Dispersal Surface $\Delta$ for Player 2 separating his
Abstaining Strategy and the strategy leading to $\gamma_{10}$.
However this strategy is risky to Player 2 in the same way
as discussed in Case (1). Thus the solution consists of
Player 2 completely abstaining from the play and leaving the
optimization to Player 1. Thus in the resulting time optimal
problem, Player 1 uses $u^1 = +1$ in the region below $\gamma_{10}$
and $u^1 = -1$ above $\gamma_{10}$ in Figure 4.4. It may be noted
that except in the regions between $\gamma$ and $p'$ where the
game has no equilibrium point, the rest of the strategies
are in equilibrium.

No Observations to the Players : Since the game does not
exhibit multiplicity of equilibrium sequences, the above
solution is valid except in the Regions between Ao and p . 

T 

Case (3) c < b and $2c - b^2 + c^2 + 2bc = 0$ :

This separates Cases (1) and (2). Considering in
the limit the solutions of Cases (1) and (2) for this case,
it is obvious that the Abstaining Strategy is suited best
for Player 2 in the perfect information case. As in the
earlier cases, it is not possible to define completely the
solution for the no observations case.

Case (4) c = b :

In Example 4.1, we observed that the sequences 
with 0 ^ € < 1 qualify as equilibrium terminal sequences 
as they do not violate the Generalized Legendre- Clebsch 
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for 


e < 


+1 0 +€ 

■q— (see Appendix G). 


with 0 ^ ^ 1 and 


Condition. Thus one encounters an infinite number of 

i -n i -t 

equilibrium sequences ' — x — 

'll Til 

JH K i 


r+1 ±l Til 

L + 1 0 +€ -1 


The Payoff Dominance concept applied to these 
equilibrium sequences yields that Player 1 prefers the 
sequences f ^ with the corresponding decomposition 

L+i o +iJ 

shown in Figure 4.5(i). Similarly for Player 2, the preferred
decomposition is represented in Figure 4.5(ii). The switching 
curves p and p l given by (4.39) and (4.37) to this 
limiting case become identical with the x^ - axis. 

No Observations to the Players : For starting points on
there is full agreement between the players. Since the 
recombinations of the individually preferred strategies are
not playable for any other starting points, the game ends in
a deadlock as pointed out in the earlier cases.

Perfect Observations to the Players : As in Case (1), the
recombined strategies are playable in this case. Similarly,
it is risky for Player 2 to play his preferred strategy, in
the same sense as before. A repetition of the arguments in
Case (1) gives the solution in the present case as given by
Figure 4.5(i).


We give below a summary of the example and
its solution as presented in Example 3.2 and this section.
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Summary of the Results : 

The game satisfies the equations

\[ \dot{x}_1 = x_2, \qquad \dot{x}_2 = u^1 + c u^2 \tag{4.49} \]

The state of the game is to be transferred from $x_0$ to the
origin by the control actions $u^1, u^2$ of the players, which
are constrained as follows :

\[ |u^1| \le 1 ; \qquad |u^2| \le 1 \tag{4.50} \]

The payoff functions of the players are given by

\[ J^1[x_0, u] = \int_0^{t_f} dt, \qquad
   J^2[x_0, u] = \int_0^{t_f} \{ |u^1| + b |u^2| \} \, dt \tag{4.51} \]

which the respective players wish to minimize.


The solution of the game is defined below for the
perfect information case. The solution of the game with no
observations is defined only for the case c > b and is
essentially the same (except for implementation) as the
corresponding case with perfect observations.

The equilibrium sequences for any starting point
are unique when c > b and the solution and the switching
surfaces are indicated in Figure 3.3. The equilibrium
sequences are nonunique for the case c < b with the added
condition $2c - b^2 + c^2 + 2bc > 0$ and the case c = b. In
particular there are an uncountable number of sequences for
the latter case. In these cases the solution is defined by a
suitable selection of the sequences based on the concepts
of Payoff and Risk Dominance. The switching curves for
these solutions are represented in Figures 4.3(i) and 4.5(i).

The resulting solution in the above cases is
essentially similar, with the switching curves given by
similar expressions. The switching curve lies in the II and
IV quadrants of the state space when c > b and gradually
becomes closer to, and coincides with, the $x_1$-axis when
c = b. With c < b it changes quadrants.

When c < b and $2c - b^2 + c^2 + 2bc < 0$, there
are no equilibrium sequences for certain initial points.
However, the solution is defined through a certain abstaining
strategy of Player 2 in which he sets his control variable
at zero and leaves the problem to Player 1. The resulting
solution of the well-known time optimal problem is shown in
Figure 4.4.
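
The abstaining strategy can be illustrated numerically. The sketch below is not taken from the thesis: it assumes the classical time optimal switching curve $x_1 = -\tfrac{1}{2} x_2 |x_2|$ for Player 1 once Player 2 sets $u^2 = 0$, integrates (4.49) with a simple Euler step, and accumulates the payoffs (4.51). The function names, the step size `dt` and the stopping tolerance `tol` are illustrative choices only.

```python
import math


def time_optimal_u1(x1, x2):
    """Bang-bang law for Player 1 when Player 2 abstains (u2 = 0).

    Assumes the classical switching curve x1 = -0.5 * x2 * |x2|.
    """
    s = x1 + 0.5 * x2 * abs(x2)  # signed offset from the switching curve
    if s > 0:
        return -1.0
    if s < 0:
        return 1.0
    # On the curve: drive x2 towards zero (0.0 only at the origin itself).
    return -math.copysign(1.0, x2) if x2 != 0 else 0.0


def min_time(x1, x2):
    """Closed-form minimum time to the origin for the same law."""
    s = x1 + 0.5 * x2 * abs(x2)
    if s > 0:
        return x2 + 2.0 * math.sqrt(0.5 * x2 * x2 + x1)
    if s < 0:
        return -x2 + 2.0 * math.sqrt(0.5 * x2 * x2 - x1)
    return abs(x2)


def simulate_abstaining(x1, x2, b, c, dt=1e-3, tol=2e-2, t_max=50.0):
    """Euler simulation of (4.49) with u2 = 0; returns (J1, J2) of (4.51)."""
    t = j1 = j2 = 0.0
    while math.hypot(x1, x2) > tol and t < t_max:
        u1 = time_optimal_u1(x1, x2)
        u2 = 0.0  # Player 2 abstains
        x1, x2 = x1 + x2 * dt, x2 + (u1 + c * u2) * dt
        j1 += dt                                # time payoff of Player 1
        j2 += (abs(u1) + b * abs(u2)) * dt      # fuel-type payoff of Player 2
        t += dt
    return j1, j2


if __name__ == "__main__":
    j1, j2 = simulate_abstaining(1.0, 0.0, b=2.0, c=0.5)
    print("simulated J1, J2:", j1, j2)
    print("closed-form minimum time:", min_time(1.0, 0.0))
```

Since Player 1 always uses $|u^1| = 1$ and Player 2 uses $u^2 = 0$, both payoffs reduce to the elapsed time in this sketch, whatever the values of b and c.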


The solution seems biased to Player 1 since, while
he is interested in time only, Player 2 is interested in a
performance index which has a weightage for the first
player's fuel also. Finally, since each player is interested
in the termination of the game, if one player has
observations, then he has to follow his opponent's preferences.

4.5 CONCLUSIONS

In this chapter, the various switching surfaces
encountered in optimal control and differential game problems
are classified and general construction procedures are indicated.
Though at present the study of these surfaces is mostly
motivated by examples, we concur with Isaacs (1969) that a
comprehensive theory of differential games can be developed
through a thorough study of the switching surfaces.


The complete noncooperative solution of the double
integral plant problem is presented. In the process, we come
across many features which are met with in finite games such as
multiplicity, nonequivalence and noninterchangeability of
equilibrium points, application of Payoff and Risk Dominance
concepts for defining solutions, etc. However, the nonplayability
of recombined strategies, as we saw in Section 4.4, consequent to
the differential game having different Normal Forms, has no
parallel in finite games.


An important method of obtaining a solution, viz., by
numerical computation, is very difficult in these problems
because of the dimensionality of the problem, nonavailability of
reliable and efficient computational methods in finite games and
because of each player's cost at the equilibrium point being
insensitive to his own strategy but being sensitive to the
other player's strategies (see Starr 1969). Above all, the
noncooperative solution requires a selection of the
equilibrium sequences, which are sometimes uncountable.

In the next chapter we consider Cooperative Solutions
of differential games.



CHAPTER V 

COOPERATIVE SOLUTIONS AND MULTICRITERION OPTIMAL CONTROL

5.1 INTRODUCTION

In Chapter II we observed that Pareto optimality
is a weak optimality concept and that it is the central
theme of all Cooperative Solutions. This chapter deals
with the cooperative solutions of N-person differential
games and the application of the results developed in this
thesis to Multicriterion Optimal Control.

The necessary conditions for Pareto optimality
derived by Chang (1966) and Das and Sharma (1969) are
stated in Section 5.2. This is followed by the cooperative
solutions of differential games. We discuss the theory of
multicriterion optimal control problems in Section 5.4.
Specifically we show that multicriterion optimal control
problems can be solved as cooperative N-person differential
games, with equal information to all the players.

We describe in Section 5.5 a problem in optimal
control giving the sensitivity of optimal control for small
changes in performance index and adapt this as a computational
method of obtaining the solution of multicriterion optimal
control problems. In Section 5.6 we complete the Bicriterion
Optimal Control Problem of minimizing the time and fuel of
a double integral plant carried throughout the thesis.



5.2 PARETO OPTIMALITY CONCEPT

We recall the deterministic differential game
formulated in Section 2.3. The game satisfies the state
equation

\[ \dot{x} = f(x, u, t) \tag{5.1} \]

where x and u are of dimensions n and r respectively (see
Footnote 1). The initial condition at time $t_0$ and the
terminal surface are specified as

\[ x(t_0) = x_0 \tag{5.2} \]

\[ \psi(x_f, t_f) = 0 \tag{5.3} \]

or alternatively as 

t f » T( O ; x f * X( 6 ) (5.4) 

The p-th player chooses his control action
$u^p \in \Omega^p$ so as to minimize his payoff functional

\[ J^p[x_0, t_0, u] = \phi^p(x_f, t_f) + \int_{t_0}^{t_f} L^p(x, u, t) \, dt \tag{5.5} \]

The payoff functional vector $J = (J^1, \ldots, J^N)$
introduces a partial ordering (denoted by $\le$ below) on the
admissible joint control actions $u = (u^1, \ldots, u^N)$ of the
players. According to this ordering, we have for any two
joint control action vectors $u_1$ and $u_2$

\[ u_1 \le u_2 \ \text{if and only if} \ J[x_0, t_0, u_1] \le J[x_0, t_0, u_2] \tag{5.6} \]

1 It is obvious that $r = r^1 + \cdots + r^N$.

A vector is said to be Less Than or Equal To another (as in
the right-hand side of (5.6)) if and only if each component
of the first is less than or equal to the corresponding
component of the other (see Footnote 2). We also say that the
first vector is Below the second vector if all these are
represented in a proper vector space. Thus

\[ J[x_0, t_0, u_1] \le J[x_0, t_0, u_2] \ \text{if and only if} \
   J^p[x_0, t_0, u_1] \le J^p[x_0, t_0, u_2], \quad p = 1, \ldots, N \tag{5.7} \]

(see Footnote 3).

A control vector $u^\circ$ is Pareto Optimal if there
exists no other admissible control action which yields a
payoff vector Less Than $J[x_0, t_0, u^\circ] = V(x_0, t_0)$, where
V is called the Value Function. Thus for any u, we have

\[ J[x_0, t_0, u] \le V(x_0, t_0) \ \text{only if} \ J[x_0, t_0, u] = V(x_0, t_0) \tag{5.8} \]

Thus $u^\circ$ is weakly optimal with respect to the partial
ordering defined by (5.6). The necessary conditions for
$u^\circ$ to be Pareto optimal are stated below.

2 This applies equally well to the relations Not Less Than,
Less Than, Equal To, Strictly Greater Than and Greater
Than or Equal To. Also it should be noted that Not Less
Than is not the same as Greater Than or Equal To.

3 Thus this in turn defines a partial ordering on the 
yt. space. 
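
The componentwise ordering of (5.6) and (5.7) and the Pareto test of (5.8) can be tried out on a finite list of payoff vectors. The following is a minimal sketch only: the finite list stands in for the set of payoff vectors generated by the admissible controls, and the function names and sample numbers are hypothetical.

```python
def less_equal(a, b):
    """Componentwise Less Than or Equal To, as in (5.7)."""
    return all(x <= y for x, y in zip(a, b))


def pareto_optimal(payoffs):
    """Indices of payoff vectors that pass the test (5.8).

    A vector passes if every vector Less Than or Equal To it is
    identical to it.
    """
    keep = []
    for i, p in enumerate(payoffs):
        dominated = any(
            less_equal(q, p) and tuple(q) != tuple(p) for q in payoffs
        )
        if not dominated:
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Hypothetical (J1, J2) pairs for four admissible controls.
    payoffs = [(1, 4), (2, 2), (3, 3), (4, 1)]
    print(pareto_optimal(payoffs))
    # The ordering is only partial: neither of these holds.
    print(less_equal((1, 4), (2, 2)), less_equal((2, 2), (1, 4)))
```

In this sample the vector (3, 3) is excluded because (2, 2) is Below it, while the remaining three vectors are mutually incomparable, which is the nonuniqueness of Pareto optimal points referred to in the text.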
In order that $u^\circ$ and the corresponding
trajectory $x^\circ$ be Pareto optimal, it is necessary that
there exist a nonzero absolutely continuous vector function
$(k, \lambda(t)) = (k^1, \ldots, k^N, \lambda_1(t), \ldots, \lambda_n(t))$
(see Footnote 4), with the constant vector $k \ge 0$, such that
the following are satisfied :

(i) Euler-Lagrange equations : $x^\circ(t)$ and $\lambda(t)$ are a
solution to the canonical system

\[ \dot{x}^\circ = \frac{\partial H}{\partial \lambda}(x^\circ, \lambda, u^\circ, t) \tag{5.9} \]

\[ \dot{\lambda} = -\frac{\partial H}{\partial x}(x^\circ, \lambda, u^\circ, t) \tag{5.10} \]

satisfying the boundary conditions

\[ x^\circ(t_0) = x_0 \tag{5.11} \]

\[ \psi(x^\circ(t_f), t_f) = 0 \tag{5.12} \]

where the Hamiltonian function H is given by

\[ H(x, \lambda, u, t) = \langle k, L(x, u, t) \rangle + \langle \lambda, f(x, u, t) \rangle \tag{5.13} \]

(ii) Transversality conditions : At the terminal time $t_f$

d [ < £,£(*?> t f )> + <'$, (x°, t f )> ] 

Mt f ) = f f aif f ' £ (5.14) 

< lL>£LC4> t f» + )>] 

H(t ) = - — Tr "" " — “ (5.15) 


where is some constant vector, .Alternatively, 

<L + H(t f ) 1^- X(t f ) = 0 (5.16) 

— > 'i,,!, - , : 

4 It is to be noted that k is similar to $\lambda_0$ in
Pontryagin's Minimum Principle.
(iii) Minimum principle : The function $H(x^\circ, \lambda, u, t)$ has
an absolute minimum as a function of u over $\Omega$ at
$u = u^\circ(t)$ for t in $[t_0, t_f]$, i.e.,

\[ H(x^\circ, \lambda, u^\circ, t) = \min_{u \in \Omega} H(x^\circ, \lambda, u, t) \tag{5.17} \]

Since, if $k > 0$, we can choose k such that

\[ k^1 + \cdots + k^N = 1 \tag{5.18} \]

it follows that for different Pareto optimal points, we are
minimizing different convex combinations of the payoff
functionals (see Footnote 5). The significance of these
necessary conditions is this important scalarization of the
vector functional problem. It is clear from (5.13) and (5.17)
that H and $u^\circ$, and hence the Value Function V, are
functions of k.
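
The scalarization can be illustrated on a finite set of payoff vectors. This is a sketch under stated assumptions, not the procedure of the thesis: the payoff list is hypothetical, the weights are taken strictly positive and normalized as in (5.18), and minimizing the weighted sum over a finite list replaces the minimization over admissible controls.

```python
def weighted_minimizer(payoffs, k):
    """Index of the payoff vector minimizing the convex combination <k, J>."""
    return min(
        range(len(payoffs)),
        key=lambda i: sum(w * j for w, j in zip(k, payoffs[i])),
    )


def sweep_weights(payoffs, first_weights):
    """Minimizers obtained as k = (w, 1 - w) ranges over the given values."""
    return {w: weighted_minimizer(payoffs, (w, 1.0 - w)) for w in first_weights}


if __name__ == "__main__":
    # Hypothetical (J1, J2) pairs; (3, 3) is not Pareto optimal.
    payoffs = [(1, 4), (2, 2), (3, 3), (4, 1)]
    for w, i in sweep_weights(payoffs, [0.1, 0.5, 0.9]).items():
        print("k = (%.1f, %.1f) selects" % (w, 1.0 - w), payoffs[i])
```

Different weight vectors k select different Pareto optimal vectors, and the vector (3, 3), which is Below none of the others but has (2, 2) Below it, is never selected by positive weights.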

The above results are derived by Chang (1966). Below,
we consider the time-invariant Lagrange problem, i.e., f
and $L^p$ are not functions of time explicitly and for
p = 1, ... N

\[ \phi^p = 0 \tag{5.19} \]

By the known equivalence of this problem and the problem
(5.1)-(5.5), the stated results hold.

5 A problem with any particular convex combination of the 
functionals may not correspond to any Pareto optimal point 
inasmuch as the stated conditions are necessary only. 
The proof given by Chang (1966) is along the same
lines as the one given by Pontryagin (1962). One can also
construct a heuristic proof similar to the one given by
Athans and Falb (1966). We shall only indicate the changes
to be made to the proof in (Athans and Falb 1966).

In the present case, we have to define an auxiliary
variable $z^p$ for each payoff functional $J^p$ and a vector
z such that

\[ \dot{z} = L(x, u), \qquad z^p(0) = 0 \tag{5.20} \]

since we are considering the time-invariant problem. Also let

\[ z = (z^1, \ldots, z^N), \qquad q = (z, x) \tag{5.21} \]

Interpreted in this context, the principle of optimality
implies that any trajectory q in the cost-state space
cannot be Below (not same as Above because of Footnote 2)
the optimal trajectory $q^\circ$, i.e., for any q = (z, x)
such that

\[ x = x^\circ \tag{5.22} \]

it follows that

\[ z \nless z^\circ \tag{5.23} \]



By the usual temporal and spatial variations of
control, the Terminal Cone is constructed and it similarly
follows that the Cost Orthant (set product of the cost
half-rays) given by

\[ \{ q : q = (z, x^\circ), \ z \le z^\circ \} \tag{5.24} \]

does not have an intersection with the interior of the
Terminal Cone. Thus the existence of the Separating
Hyperplane along with the adjoint equations gives the
minimum principle. Transversality conditions also are
obtained in a similar manner.

The questions related to existence of Pareto
optimal solutions are considered by Olech (1957) and
Das and Sharma (1969).

5.3 COOPERATIVE SOLUTIONS OF DIFFERENTIAL GAMES 

Some typical examples of Cooperative Differential 
Games arise in situations involving Collision Avoidance of 
approaching Aircraft and Naval vessels and Rendezvous of 
two Spaceships (tong 1967). In these problems both the 
players have the same objective functional and can be 
considered as agents of the same controlling agency. Thus 
in terms of solution, they are no different from the 
optimal control problems. The multipursuer-single evader
game suggested by Isaacs can be reduced to a two-person
zero-sum game on similar lines since all the pursuers
have the same objective of capturing the evader in minimum
time by joint effort.

However, in general the payoff functionals of
the players may be all different as is indicated in the
formulation of the problem in this Thesis. In such a case,
the solution is given by Pareto optimal points. Since
these points are nonunique, Supercriteria are required to
solve the game. The various cooperative solutions studied
as arbitration schemes differ in terms of the Supercriteria
involved. The resulting solution should reflect the
strategic potentialities of the players effectively. These
include the threat capabilities, powers of forming coalitions
etc. which are reflected in the noncooperative play of the
game, which is thus necessary in some form for the
supercriteria.

As we observed in Chapter II, the solution of a
game in which comparison of payoffs and sidepayments
between the players are permitted is given by that Pareto
optimal strategy which minimizes the Total Payoff expressed
in the common unit of comparison. Thus once again the
problem is reduced to an optimal control problem. The
distribution of the total payoff among the players is
termed the Redistribution Problem and is solved by the
Characteristic Function Theory. The main concept of the
theory is the Characteristic Function itself, defined for
any subset of players (termed Coalition) in the game. It
is given by the Security Level of this Coalition (considered
as one player) in a two-person zero-sum game with the rest
of the players forming the opponent and the payoff given
by the sum of the payoffs of the individual players of
the Coalition. Thus the extension of these concepts to the
present setting is straightforward. For example, for the
double integral plant considered in Example 3.2 and
Section 4.4, the above consideration reduces to one of
finding out who is the stronger player of the two, i.e.,
whether c > 1 or c < 1. We will not consider this
aspect any further.

The solution of the game without sidepayments is
given as the Nash Cooperative Solution (see Footnote 6)
(Nash 1953, Harsanyi 1969). In this case the Dilemma is
resolved by considering the Bargaining between the players
with the initial point as the noncooperative solution. The
resulting solution has many desirable properties such as
symmetry, independence of irrelevant alternatives and
invariance with respect to utility transformations and
reflects the optimal threats and demands of the players.
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The underlying Supercriterion is defined as 

\[ I(x_0, t_0, k) = -\prod_{p=1}^{N} \left[ W^p(x_0, t_0) - J^p[x_0, t_0, u^\circ(k)] \right]
   = -\prod_{p=1}^{N} \left[ W^p(x_0, t_0) - V^p(x_0, t_0, k) \right] \tag{5.25} \]

with the terms in the product of (5.25) corresponding to
all the players being positive. The solution $u^\circ(k^\circ)$ is
Pareto optimal and minimizes the above Supercriterion over
all values of k, i.e.,

\[ \min_{k} I(x_0, t_0, k) = I(x_0, t_0, k^\circ) \tag{5.26} \]

It is easily seen that the minimization can be performed
over all possible values of k because all the terms in
the product on the right-hand side of (5.25) are positive
(see Footnote 5).
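
The selection in (5.26) can be sketched on a finite grid of weight values. Everything numerical below is hypothetical: `w` stands for the noncooperative Values $W^p$, and the dictionary maps a few trial weights k to assumed cooperative payoff vectors. Candidates for which some term of the product is not positive are skipped, in line with the positivity requirement stated with (5.25). `math.prod` requires Python 3.8 or later.

```python
import math


def supercriterion(w, v):
    """Negative product of the gains W^p - V^p, as in (5.25).

    Returns None when some gain is not positive.
    """
    gains = [wp - vp for wp, vp in zip(w, v)]
    if any(g <= 0 for g in gains):
        return None
    return -math.prod(gains)


def nash_cooperative_choice(w, candidates):
    """Pick the weight k minimizing the supercriterion over the candidates."""
    best = None
    for k, v in candidates.items():
        value = supercriterion(w, v)
        if value is not None and (best is None or value < best[1]):
            best = (k, value)
    return best


if __name__ == "__main__":
    w = (4.0, 4.0)  # hypothetical noncooperative Values of two players
    candidates = {
        0.25: (3.5, 1.0),
        0.50: (2.0, 2.0),
        0.75: (1.0, 3.5),
        0.90: (0.5, 4.5),  # Player 2 would lose here, so it is skipped
    }
    print(nash_cooperative_choice(w, candidates))
```

In this sample the balanced candidate is selected, since the product of the two gains is largest there.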

5.4 MULTICRITERION OPTIMAL CONTROL PROBLEMS

In this section, we study the formulation and 
solution concepts of Multicriterion Optimal Control 
Problems. The formulation of Multicriterion Optimal 
Control Problems follows the same pattern as the classical 
optimal control problems except that there are multiple 
criteria in this case. 

The state of the system x of dimension n
satisfies the vector differential equation

\[ \dot{x} = f(x, u, t) \tag{5.27} \]

where u, the r-dimensional control action vector, is
restricted such that

\[ u \in \Omega \tag{5.28} \]

The state of the system has to be transferred from an
initial state $x(t_0) = x_0$ to the terminal surface given by

\[ \psi(x_f, t_f) = 0 \tag{5.29} \]

The control law is to be chosen so as to minimize
the following criteria for p = 1, ... N.

\[ J^p[x_0, t_0, u] = \phi^p(x_f, t_f) + \int_{t_0}^{t_f} L^p(x, u, t) \, dt \tag{5.30} \]

Also the measurable system outputs may be given by the
observation equation

\[ y = h(x, t) \tag{5.31} \]

Because of the presence of more than one criterion, 
the performance of a control law in relation to another may 
be better with respect to one criterion and worse with
respect to another. Expressed mathematically, a vector 
criterion induces only a partial ordering on the set of 
control policies while a scalar criterion induces a total 
ordering. 

A control law with the property that the system
performance cannot be improved with respect to any of the
criteria without simultaneously deteriorating the
performance with respect to some other is a basic concept
of solution. It is called Noninferior because of its weak
optimality property with respect to the partial ordering.

It is clear that Noninferiority is equivalent to Pareto
Optimality in Section 5.2. Also we saw that the control
laws with this property are nonunique and this is the
Dilemma introduced by the partial ordering. Without
further knowledge (or alternatively without the application
of supercriteria) there is no way of sifting through the
various Noninferior control laws to obtain an acceptable
solution which should exhibit the tradeoff factors between
the criteria effectively. We consider below a few methods
suggested in literature (Nelson 1964, Chyung 1967 and
Athans and Falb 1966) as such Supercriteria.

One of the methods due to Nelson (1964), specifies 
some acceptable bounds on all but one criterion and then 
optimizes the system performance with respect to this free 
criterion. In essence, the problem is reduced to an 
optimal control problem with several isoperimetric
constraints (Lee 1966). The classical design techniques 
patently follow this idea, in which it is usual to put 
bounds on all the criteria such as phase and gain margins, 
rise time etc. and any solution satisfying these bounds is 
taken as satisfactory. Thus there is no optimization 
involved here. 
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A second method is to order the criteria according 
to their importance and apply them in a hierarchy (Chyung
1967). Thus, if the optimal controls resulting from the
application of the most preferred criterion are nonunique,
then they are tested with the second criterion etc. until
the resulting control is unique. This method does not
ensure the participation of all the criteria in the final
selection of the resulting optimal control law. Such a
method is suited and resorted to in the Chebychev's problem
(Johnson 1967) and fuel optimal problem (Athans and Falb 1966)
where the optimal controls with respect to the first
criterion are grossly nonunique.


A third method suggested in literature (Athans
and Falb 1966) is to weigh all the criteria with positive
weights and select a control law that is optimal with respect
to the weighted index. However many problems may not
permit the assumptions in the above methods.
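
The three Supercriteria just described can be compared on a small finite set of candidate control laws, each summarized by its vector of criterion values. This is an illustrative sketch only: the candidate (time, fuel) values, the bound and the weights are hypothetical, and a finite list replaces the optimization over control laws.

```python
def bounded_criterion(candidates, free, bounds):
    """First method: bound all but one criterion, minimize the free one."""
    feasible = [
        c for c in candidates
        if all(c[i] <= upper for i, upper in bounds.items())
    ]
    return min(feasible, key=lambda c: c[free]) if feasible else None


def hierarchical(candidates, order):
    """Second method: apply the criteria one after another in a hierarchy."""
    remaining = list(candidates)
    for i in order:
        best = min(c[i] for c in remaining)
        remaining = [c for c in remaining if c[i] == best]
        if len(remaining) == 1:
            break
    return remaining


def weighted(candidates, weights):
    """Third method: minimize a positively weighted sum of the criteria."""
    return min(candidates, key=lambda c: sum(w * x for w, x in zip(weights, c)))


if __name__ == "__main__":
    # Hypothetical (time, fuel) values of four candidate control laws.
    candidates = [(2.0, 2.0), (2.5, 1.2), (3.0, 1.0), (2.0, 1.8)]
    print(bounded_criterion(candidates, free=0, bounds={1: 1.5}))
    print(hierarchical(candidates, order=[0, 1]))
    print(weighted(candidates, (0.5, 0.5)))
```

In this sample the hierarchy with time first uses fuel only to break the tie between the two fastest laws, which illustrates the remark that the lower criteria need not participate fully in the final selection.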


In this thesis, we treat the multicriterion
optimal control problems under the framework of N-person
differential games (also see Sarma and Prasad 1969). We
assume that the control resources represented by u in
the formulation (5.27) - (5.31) can be allocated to the
various criteria. As we pointed out in Chapter I, this may
be naturally given. Insofar as the performance criteria
are a mathematical characterization of some of the
objectives, we believe that the plant chosen will usually
allow such an allocation.

Thus a multicriterion optimal control problem
after such an allocation is similar in form to a
differential game without sidepayments and with as many
players as there are criteria. The additional restriction
is that the observations y in (5.31) are the same for
all the players. It is for convenience of solution that
the Designer breaks the problem and casts it into a game
with one player corresponding to each criterion. The
solution obtained in this case should more truly reflect
the tradeoffs between the indices.

5.5 A COMPUTATIONAL METHOD FOR NASH SOLUTION

In this section, first we present a problem in
optimal control giving the sensitivity of the optimal
cost and optimal control for small changes in the
performance index. For simplicity we assume a fixed time,
free end point problem and the control to be unconstrained.

The system equations and the performance index
are given as
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The optimal control u° which minimizes (5.33) is a 
function of $\epsilon$ and the resulting optimal Value is given
by

$$V(x_0, t_0, \epsilon) = J[x_0, t_0, u^o(\epsilon), \epsilon] \tag{5.34}$$


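The optimal control for the case $\epsilon = 0$ is given
for the above problem. The optimal control and the optimal
cost are to be obtained for any small nonzero $\epsilon$. Now to
a first order approximation, since $\epsilon$ is small, we have

$$u^o(\epsilon) \approx u^o(0) + \frac{\partial u^o}{\partial \epsilon}\,\epsilon \tag{5.35}$$

and

$$V(\epsilon) = V(0) + \frac{\partial V}{\partial \epsilon}\,\epsilon \tag{5.36}$$

Hence we need to determine expressions for $\partial u^o/\partial \epsilon$ and
$\partial V/\partial \epsilon$ for the above problem.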


The optimal Hamiltonian and the canonical
equations for the problem are given below. It is obvious
that the canonical variables are functions of $\epsilon$.

$$H(\epsilon) \triangleq \lambda f(x^o,u^o,t) + L(x^o,u^o,t) + \epsilon L'(x^o,u^o,t) = H(0) + \epsilon L' \tag{5.37}$$

$$\dot{x}^o = \frac{\partial H}{\partial \lambda}; \quad x^o(t_0) = x_0 \tag{5.38}$$



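Thus writing the expressions for $\partial \dot{x}^o/\partial \epsilon$ and
$\partial \dot{\lambda}/\partial \epsilon$, we have

$$\frac{d}{dt}\left(\frac{\partial x^o}{\partial \epsilon}\right) = \frac{\partial f}{\partial x^o}\frac{\partial x^o}{\partial \epsilon} + \frac{\partial f}{\partial u^o}\frac{\partial u^o}{\partial \epsilon}; \quad \frac{\partial x^o}{\partial \epsilon}(t_0) = 0 \tag{5.39}$$

$$\frac{d}{dt}\left(\frac{\partial \lambda}{\partial \epsilon}\right) = -\frac{\partial^2 H(0)}{\partial x^{o2}}\frac{\partial x^o}{\partial \epsilon} - \frac{\partial^2 H(0)}{\partial x^o \partial u^o}\frac{\partial u^o}{\partial \epsilon} - \frac{\partial^2 H(0)}{\partial x^o \partial \lambda}\frac{\partial \lambda}{\partial \epsilon} - \frac{\partial L'}{\partial x^o}; \quad \frac{\partial \lambda}{\partial \epsilon}(t_f) = \frac{\partial^2 \phi}{\partial x_f^2}\frac{\partial x_f}{\partial \epsilon} + \frac{\partial \phi'}{\partial x_f} \tag{5.40}$$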


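where all the quantities in (5.39) and (5.40) are evaluated
at $\epsilon = 0$. For optimality of $u^o(\epsilon)$ we have

$$\frac{\partial H(\epsilon)}{\partial u^o} = \frac{\partial H(0)}{\partial u^o} + \frac{\partial^2 H(0)}{\partial u^{o2}}\frac{\partial u^o}{\partial \epsilon}\,\epsilon + \frac{\partial^2 H(0)}{\partial u^o \partial x^o}\frac{\partial x^o}{\partial \epsilon}\,\epsilon + \frac{\partial^2 H(0)}{\partial u^o \partial \lambda}\frac{\partial \lambda}{\partial \epsilon}\,\epsilon + \frac{\partial L'}{\partial u^o}\,\epsilon = 0 \tag{5.41}$$

where the expressions in (5.41) are evaluated corresponding
to the optimal control. Taking the partial derivative with
respect to $\epsilon$, (5.41) yields

$$\frac{\partial^2 H(\epsilon)}{\partial u^o \partial \epsilon} = \frac{\partial^2 H(0)}{\partial u^{o2}}\frac{\partial u^o}{\partial \epsilon} + \frac{\partial^2 H(0)}{\partial u^o \partial x^o}\frac{\partial x^o}{\partial \epsilon} + \frac{\partial^2 H(0)}{\partial u^o \partial \lambda}\frac{\partial \lambda}{\partial \epsilon} + \frac{\partial L'}{\partial u^o} = 0$$

or

$$\frac{\partial u^o}{\partial \epsilon} = -\left[\frac{\partial^2 H(0)}{\partial u^{o2}}\right]^{-1}\left[\frac{\partial^2 H(0)}{\partial u^o \partial x^o}\frac{\partial x^o}{\partial \epsilon} + \frac{\partial^2 H(0)}{\partial u^o \partial \lambda}\frac{\partial \lambda}{\partial \epsilon} + \frac{\partial L'}{\partial u^o}\right] \tag{5.42}$$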


Now from (5.33), we have
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Equations (5.39), (5.40), (5.42) and (5.43) constitute the 
solution of the problem. 
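The first order idea behind (5.35) and (5.42) can be checked on a static analogue with no dynamics, so that only the last term of (5.42) survives. This is a minimal sketch under stated assumptions: the scalar function h*u^2/2 + q*u plays the role of H(0), the scalar function r*u^2/2 + s*u plays the role of L', and all four coefficients are hypothetical values chosen for the illustration.

```python
def u_opt(eps, h=2.0, q=-4.0, r=1.0, s=3.0):
    """Exact minimizer of (h*u**2/2 + q*u) + eps*(r*u**2/2 + s*u)."""
    return -(q + eps * s) / (h + eps * r)


def du_deps(h=2.0, q=-4.0, r=1.0, s=3.0):
    """Static analogue of (5.42): -(d2H/du2)^-1 * dL'/du, evaluated at eps = 0."""
    u0 = -q / h                  # optimal control of the unperturbed problem
    dlprime_du = r * u0 + s      # gradient of the perturbing index at u0
    return -dlprime_du / h


def u_first_order(eps):
    """First order prediction in the sense of (5.35)."""
    return u_opt(0.0) + du_deps() * eps


eps = 0.01
print(u_opt(eps), u_first_order(eps))
```

The two printed values differ only at second order in eps, which is the sense in which (5.35) is claimed to hold.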

The preceding problem can be adapted to obtain
$\underline{k}^o$ in the Nash solution. The main theme of this method
is to assume some $\underline{k}$ initially and move suitably to a new
value depending upon the gradient of $I$, the Nash
Supercriterion. We show below that this involves the
solution of a nonlinear programming problem.


By the partial differentiation of (5.25), we have

$$\frac{\partial I}{\partial k^p} = -I \sum_{i} \frac{1}{W^i - V^i}\,\frac{\partial V^i}{\partial k^p} \tag{5.44}$$


We move in the $\underline{k}$ space by a small displacement vector
$\underline{\epsilon}$ such that

(5.45)

where $\delta$ is a small number and such that

$$\epsilon^1 + \dots + \epsilon^N = 0 \tag{5.46}$$

$$-k^p < \epsilon^p < 1 - k^p, \quad p = 1, \dots, N \tag{5.47}$$

to minimize

$$\delta I = \sum_p \epsilon^p \frac{\partial I}{\partial k^p} \tag{5.48}$$

Constraint (5.47) is meant to keep the new $k^p$ also
positive. Finding $\underline{\epsilon}$ to minimize (5.48) satisfying
(5.45) - (5.47) constitutes the familiar nonlinear
programming problem (Hadley 1964).
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This problem can be solved by the general methods 
available to Convex Programming problems such as Gradient 
Projection method and the method of Feasible Directions. 
However, the problem being one of minimizing a linear
function subject to one quadratic and a number of linear 
constraints, it can be solved by a special technique 
(Panne 1966) which terminates in a finite number of 
iterations compared to the general methods. The method 
suggested is a combination of the Simplex method and a 
parametric version of the Simplex and dual methods for 
quadratic programming (Dantzig 1963). The detailed rules 
and a simple example are given by Panne (1966). The lack 
of nonnegativity constraints on the variables can be taken 
care of by defining auxiliary variables. 
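When none of the bounds (5.47) is active, the displacement can be written in closed form, which gives a quick check on any general solver. The sketch below is not the technique of Panne (1966). It assumes that the quadratic constraint (5.45) bounds the Euclidean length of the displacement by $\delta$, which is an assumption made here for illustration, and the function name `displacement` is hypothetical.

```python
import math


def displacement(grad, k, delta):
    """Minimize sum(eps*grad) subject to sum(eps) = 0 and |eps| <= delta.

    Assumes the bounds -k[p] < eps[p] < 1 - k[p] are inactive and raises otherwise.
    """
    n = len(grad)
    mean = sum(grad) / n
    # Project the gradient onto the subspace in which the components sum to zero.
    projected = [g - mean for g in grad]
    norm = math.sqrt(sum(p * p for p in projected))
    if norm == 0.0:
        return [0.0] * n  # no feasible descent direction
    eps = [-delta * p / norm for p in projected]
    if any(not (-kp < e < 1.0 - kp) for e, kp in zip(eps, k)):
        raise ValueError("a bound is active; a programming method is needed")
    return eps


print(displacement([3.0, 1.0], [0.5, 0.5], 0.1))
```

If a bound becomes active the closed form no longer applies, and one of the programming methods cited above has to be used instead.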

After finding $\underline{\epsilon}$ we change $\underline{k}$ and $u^o$
iteratively. Thus after the $i$th iteration, we have

$$u_{i+1} = u^o_i + \sum_p \frac{\partial u^o}{\partial k^p}\,\epsilon^p_i \tag{5.49}$$

$$k^p_{i+1} = k^p_i + \epsilon^p_i \tag{5.50}$$

where subscripts refer to the iteration numbers. Utilizing
$u_{i+1}$ as the new guess in the scalarized optimal control
problem with the corresponding convex combination represented
by $\underline{k}_{i+1}$, we obtain the optimal $u^o_{i+1}$ by a suitable
numerical technique. Techniques, which assure convergence
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when the assumed control is very near the optimal control, 
are enough for this purpose. 


This iterative procedure can be repeated till we
reach the optimal value $\underline{k}^o$ in the Nash solution along
with the corresponding optimal control $u^o(\underline{k}^o)$.


5.6 A BICRITERION OPTIMAL CONTROL PROBLEM

In this section, we apply the results of the
earlier sections to an illustrative example. The system
is the familiar double integral plant with bounded
inputs, i.e.,

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = u^1 + c\,u^2 \tag{5.51}$$

$$|u^1| \le 1; \quad |u^2| \le 1 \tag{5.52}$$

The inputs are to be chosen to minimize the criteria

$$J^1[x_0,u] = \int_0^{t_f} dt, \qquad J^2[x_0,u] = \int_0^{t_f} \left(|u^1| + b\,|u^2|\right) dt \tag{5.53}$$

while driving the state of the system from an arbitrary
point $x_0$ at time $t = 0$ to the origin, i.e.,
$x_1(t_f) = x_2(t_f) = 0$. The terminal time $t_f$ is assumed
free.
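One end of the tradeoff in this example is easy to compute: if only the first criterion matters, both inputs are used at full magnitude and the plant behaves as a double integrator with acceleration bound 1 + c. The sketch below uses the standard minimum-time result for that plant and assumes c > 0. The function names are hypothetical, and the fuel figure holds only for this pure minimum-time policy, not for any other Pareto optimal point.

```python
import math


def time_optimal_plan(x1, x2, c):
    """Return [(sign, duration), ...] with u1 = u2 = sign on each arc."""
    a = 1.0 + c                                  # combined acceleration bound
    s = x1 + x2 * abs(x2) / (2.0 * a)            # which side of the switch curve
    if s == 0.0:                                 # already on the final arc
        return [(-1.0 if x2 > 0 else 1.0, abs(x2) / a)]
    sign = -1.0 if s > 0 else 1.0
    # Mirror the state so that the first arc is always the decelerating one.
    y1, y2 = (x1, x2) if s > 0 else (-x1, -x2)
    t2 = math.sqrt(y1 / a + y2 * y2 / (2.0 * a * a))
    t1 = y2 / a + t2
    return [(sign, t1), (-sign, t2)]


def propagate(x1, x2, accel, t):
    """Exact motion of the double integrator under constant acceleration."""
    return x1 + x2 * t + 0.5 * accel * t * t, x2 + accel * t


def criteria(x1, x2, c, b):
    """(J1, J2) of the minimum-time policy, where |u1| = |u2| = 1 throughout."""
    t_f = sum(duration for _, duration in time_optimal_plan(x1, x2, c))
    return t_f, (1.0 + b) * t_f


print(criteria(1.0, 0.0, c=1.0, b=2.0))
```

Because the fuel index grows in proportion to the transfer time under this policy, any reduction in the second criterion has to come from coasting arcs, which is what the scalarized problems treated next introduce.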

Such a model might represent the single-axis
attitude control of a satellite which is equipped with an
electric motor as well as reaction jets for control purposes.
While the batteries driving the electric motor are rechargeable
by solar radiation, the gas consumed for the reaction jets
cannot be replenished. The problem formulated corresponds to
a maneuver which should be performed in minimum time, consuming
minimum fuel. The electric motor is primarily used for achieving
the minimum time criterion while the reaction jets are sparingly
used with the objective of total fuel minimization.

As the allocation of the control resources to the
criteria is already made in a natural manner, we can get the
optimal solution as the Nash cooperative solution. The
noncooperative solution of this problem, which is already presented
in Section 3.4 (Example 3.2) and Section 4.4, is necessary for
the Nash Supercriterion.

For the different Pareto optimal points, we solve the
scalarized optimal control problems with the criterion
functionals (parametrized by $k'$) given by

$$J[x_0,u,k'] = \int_0^{t_f} \left\{ k' + (1-k')\left(|u^1| + b\,|u^2|\right) \right\} dt \tag{5.54}$$

for $0 \le k' \le 1$.

Application of the Minimum Principle:

To determine the optimal strategies, we minimize
the Hamiltonian $H$ with respect to $u^1$ and $u^2$, where

$$H = k' + (1-k')\left(|u^1| + b\,|u^2|\right) + \lambda_1 x_2 + \lambda_2\left(u^1 + c\,u^2\right) \tag{5.55}$$
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subject to the control variable constraints (5.52) and 



(5.56) 


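Hence we get

$$u^1 = -\operatorname{dez}\left(\frac{\lambda_2}{1-k'}\right), \qquad u^2 = -\operatorname{dez}\left(\frac{\lambda_2\, c}{b\,(1-k')}\right) \tag{5.57}$$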
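The dead-zone law can be checked against a direct minimization of the control-dependent part of the Hamiltonian (5.55) over a grid of admissible inputs. This is a sketch only: it takes dez to be the dead-zone function of Athans and Falb (zero inside the unit interval, the sign of its argument outside it), it assumes $b > 0$ and $k' < 1$, and the numerical values of $\lambda_2$, $k'$, $b$ and $c$ are hypothetical.

```python
def dez(a):
    """Dead-zone function: 0 for |a| < 1 and sgn(a) for |a| > 1."""
    if abs(a) < 1.0:
        return 0.0
    if abs(a) > 1.0:
        return 1.0 if a > 0 else -1.0
    raise ValueError("the minimizing control is not unique when |a| = 1")


def controls(lam2, k, b, c):
    """Controls minimizing the Hamiltonian for a given costate lam2 (k < 1, b > 0)."""
    u1 = -dez(lam2 / (1.0 - k))
    u2 = -dez(c * lam2 / (b * (1.0 - k)))
    return u1, u2


def control_terms(u1, u2, lam2, k, b, c):
    """Part of the Hamiltonian (5.55) that depends on the controls."""
    return (1.0 - k) * (abs(u1) + b * abs(u2)) + lam2 * (u1 + c * u2)


def brute_force(lam2, k, b, c, n=20):
    """Grid search over |u1| <= 1, |u2| <= 1, used only as a cross-check."""
    grid = [i / n for i in range(-n, n + 1)]
    pairs = ((u1, u2) for u1 in grid for u2 in grid)
    return min(pairs, key=lambda u: control_terms(u[0], u[1], lam2, k, b, c))


print(controls(0.8, 0.5, 2.0, 1.0), brute_force(0.8, 0.5, 2.0, 1.0))
```

With these values the first input saturates while the second stays off, which is the kind of entry that appears in the tabulated control sequences.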


This problem is a two-input version of the problem given in
Athans and Falb (1966). The optimal control law is stated
below, the proof of which is indicated in Appendix D.

For the various cases, the optimal control sequences
are tabulated in Table 5.1. In particular, when $c = b$ and
$k' = 0$, the optimal control does not exist for certain
starting points. However, similar to the one-input case
discussed by Athans and Falb (1966), $\epsilon$-optimal controls
exist. In this and some other cases (for example when
$c > b$ and $k' = \frac{c-b}{1-b+c}$) it may also be noted that there
are an uncountable number of Pareto optimal points corresponding
to one scalarized optimal control problem with the same $k'$.

The switching surfaces V- 


0 ^ ^ 1 and £ 


1 £ ■ 

for £= 0, 1, 2 

2 


Vjg « x i = 


2(i+ce) 


and 1 for 
are defined below. 



(5. 68) 



TABLE 5.1

Optimal Control Sequences 


Case 


Condition 


i c-b 
> > 


1 - c-b 


°> b r = s 


k'< 

1-b+c 


Sequences 


*1 +1 0 +1 + 1 J 

+ 1 0 0 0 +£- 

+1 +1 0 +1 +1 


o < e $ i 


?1 0 0 0 
+1 *1 o +1 


] “* l 


±1 o 

+1 ±1 


fc. = o 


0 < K < 1 


fc * 1 


0 ^ £ 1> e 2 ^ 1 


+ 1 0 +1 

+1 0 +1 

*1 +1" 

+1 +1 


c < b 




* b-c 




?i +i 
+1 0 

+1 +1 
fl 0 


o +1 +1 
0 0 +1 

o +1 ±1 

o o +e 


o <e ^ i 


+1 +1 o +1" 

r±i +ii 

*1 0 0 0. 

and 

i +1 0 








: x i= " 


2 

*2 


2(e+c) 

*£ 2 


sgn 


*2^ 


(5.59) 

(5.60) 


V t = (Cx^Xg) : X]L = - ~ x 2 sgn x g ^ 
where ^q, ^ and ^g are gi ve n below for the case c > b 


jp 2 C + 3^(1- fc 1 ) (o-b) - (l-fc*)* (b-c) a 

t p V 

fc 2 c(l+c) 


. t 


tx2 


a 


<=C - 


1 

V C 

o( + 4(1- )b 

0 »» 
t c 


f 1 to 


, * c- b 

^ 1-b+c 

(5.61) 
. , c-b 


(5.63) 


t'2 


« = 




for . *' > iSfe 


[k'-(c-b)(l-c')] 8 . 

(5.63) 


. , (1- fc)(3-b)[2)c-(l- tt)(c-b)] r -fc’ 2 

^ 1 *.. • ’ 2 c / [ K- T “(c-b) (1- 


-W)l 2 ] 


for fc < 


' • c-b 


1-b+c 


The switching curves for the case c > b are shown, 
in Figure 5.1. The curves r o > and Pg move away from 
as decreases and when yj = , P 0 coincides 

with >T , and I”! coincides with the x -> - axis. As fc.’ 

oi a ry\.-*''S~ x 

. ■ •••• ■ 

decreases further, V 9 



ism According to the 

5,1 








TABLE 5.2 

Value Function (V 1 ,!/ 2 ) of Cooperative solution (Case 



Continued. 
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*2 


1 

c 


1 + V 


sgn °<s, + | rj° CP} + - sgn %) 
1+°^C , J- c c 


'1 

.2 


_ k + c-b I l-c« 0 


1+c 


p 2 = - ~ sgn o< 2 + 


M 


1+oCpC 2 h 

CP x + c sgn *a> 


(5. 66) 


(5.67) 

(5.68) 


From the entries in Tables 3.1 and 5.1, it is clear 

that a substitution of the form 

, x 2 

<r*p 

X 1 2* sgn *2 (5.69) 

parametrizes the state space by $\sigma$. This will result in all
the Value functions $(W^1, W^2)$ and $(V^1, V^2)$ being proportional
to $|x_2|$, the proportionality factors being functions of $\sigma$.

For the solution of the Bicriterion Optimal Control
Problem for any initial state $(x_1, x_2)$, we have to find
$k'^o$ such that it minimizes the Nash Supercriterion

$$I(x_1,x_2,k') = -\left[W^1(x_1,x_2) - V^1(x_1,x_2,k')\right]\left[W^2(x_1,x_2) - V^2(x_1,x_2,k')\right] \tag{5.70}$$

Thus we have

$$I(x_1,x_2,k'^o) = \min_{k'} I(x_1,x_2,k') \tag{5.71}$$

The minimization in (5.71) is done over the $k'$ corresponding
to which $\underline{V}(k')$ dominates $\underline{W}$ (i.e., $\underline{V}(k') \le \underline{W}$).



Once again it is easy to infer from (5.69) that the
above minimization depends only on $\sigma$ and not on $(x_1, x_2)$.
Though analytical expressions for $k'^o$ in terms of $\sigma$, $c$
and $b$ are difficult to obtain, it can be concluded that
$k'^o$ is constant on a curve given by (5.69). The nature of
this curve is the same as the various switching curves shown
in Figure 5.1.

Computational Experience with the Example : 

The difficulties associated in the numerical 
computation of the noncooperative solution of differential 
games are already noted in Section 4.5 (see also Starr 1969). 
Because of the multisided nature of the optimization involved,
the methods applicable to optimal control are not directly
applicable to the case of differential games without drastic
modifications.

For the case $c = b$ of the example considered, with
imperfect information to the players, the Conjugate Gradient
Method (Lasdon et al. 1967) is tried for each player both in
alternate iterations and in alternate optimizations. In
either case, the free terminal time is determined by
satisfying
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The terminal condition of reaching the origin is sought to 
be met by introducing the Penalty Function $\frac{1}{2} M^p \|x_f\|^2$ and
modifying the performance index

$$J^{p'} = J^p + \frac{1}{2} M^p \|x_f\|^2 \tag{5.73}$$

Thus the problem is converted into a free end point problem.
For the constraints on the control variables we follow the
modification by Pagurek and Woodside (1968). The instability
which arises in this technique can be attributed to the
reasons listed in Section 4.5.
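The role of the penalty weight can be seen on a static toy problem that is much simpler than the terminal constraint above. The sketch below is an assumption-laden illustration, not the computation carried out for the example: it minimizes $u_1^2 + u_2^2$ with the equality constraint $u_1 + u_2 = 1$ replaced by a quadratic penalty of weight M, for which the stationarity conditions give the minimizer in closed form.

```python
def penalized_minimizer(M):
    """Minimizer of u1**2 + u2**2 + 0.5*M*(u1 + u2 - 1)**2.

    Setting both partial derivatives to zero gives u1 = u2 = M / (2 + 2*M).
    """
    u = M / (2.0 + 2.0 * M)
    return u, u


def constraint_violation(M):
    """How far the penalized solution is from satisfying u1 + u2 = 1."""
    u1, u2 = penalized_minimizer(M)
    return abs(u1 + u2 - 1.0)


for M in (1.0, 1e2, 1e4, 1e6):
    print(M, penalized_minimizer(M), constraint_violation(M))
```

The violation decays only like 1/(1 + M), so the constraint is met accurately only for large weights, and large weights are exactly what make the penalized problem badly conditioned for gradient methods.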


For the Nash Cooperative Solution of this problem,
the theory presented in Section 5.5 is not directly applicable
because the Hamiltonian is linear in the control variables,
which are bounded. Thus we have

$$\frac{\partial^2 H}{\partial u^{o2}} = 0 \tag{5.74}$$

thus making (5.42) inapplicable. However, because there are
only two players, the Pareto optimal points are parametrized
by a scalar parameter $k'$. Thus the method is unnecessary.
The optimal value $k'^o$ can be determined by a simple search
technique such as assuming $k' = 1$ and changing it in small
steps in a one-dimensional search to minimize the Nash
Supercriterion. Conjugate Gradient Method II (Pagurek and
Woodside 1968), which gives good convergence when the guess
is close to the optimal one, is used. The resulting optimal
control is taken as the guess for the next value of $k'$.
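The one-dimensional search can be sketched as follows. The cooperative Values used here are hypothetical closed-form stand-ins chosen for illustration; in the actual example each evaluation of $V^1$ and $V^2$ requires the solution of a scalarized optimal control problem, and the noncooperative Values W come from the earlier chapters.

```python
def nash_supercriterion(k, W, V):
    """I(k') = -[W1 - V1(k')] * [W2 - V2(k')], the two-player form of (5.70)."""
    v1, v2 = V(k)
    return -(W[0] - v1) * (W[1] - v2)


def search_k(W, V, steps=100):
    """Start at k' = 1 and step downwards, keeping the best admissible value."""
    best_k, best_i = None, None
    for i in range(steps, -1, -1):
        k = i / steps
        v1, v2 = V(k)
        if v1 > W[0] or v2 > W[1]:
            continue  # the cooperative point must dominate the noncooperative one
        value = nash_supercriterion(k, W, V)
        if best_i is None or value < best_i:
            best_k, best_i = k, value
    return best_k, best_i


def toy_values(k):
    """Hypothetical Values: V1 falls and V2 rises as k' weights criterion 1 more."""
    return 1.0 + (1.0 - k) ** 2, 1.0 + k ** 2


print(search_k((1.8, 1.8), toy_values))
```

A plain sweep like this evaluates every grid point; when each evaluation is expensive, the step size and the reuse of the previous optimal control as the next guess matter far more than the search rule itself.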
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The procedure involves the solution of several 
scalarized optimal control problems with different values
of the parameter $k'$, with each problem having Penalties
to meet the terminal constraints. The convergence and
accuracy of the results depend upon the Penalties chosen
at each step. The Penalty Function approaches are not well
developed for the optimal control problems in contrast to
the case of Nonlinear Programming Problems (Fiacco and
McCormick 1968). Another salient feature already discussed
earlier is the presence of an uncountable number of Pareto
optimal points corresponding to the same value of $k'$,
all of them being relevant in the minimization of the
Supercriterion. These problems are continuing to receive attention.


5.7 CONCLUSIONS


In this chapter, the Pareto optimality concept is
discussed in detail along with the cooperative solutions of
differential games. The necessary conditions for Pareto
optimality are equivalent to a scalarization of the vector
minimization.

Multicriterion optimal control problems are solved
as N-person differential games without sidepayments and with
equal information to all the players. The Nash solution is
suggested as a solution and a computational procedure is
suggested by utilizing a certain sensitivity problem in
optimal control.




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The solution of the Bicriterion Optimal Control 
of the double integral plant, some aspects of which are 
presented in earlier chapters, is completed. In the next 
chapter, we discuss some problems of current control- 
theoretic interest in the light of the results in this 
chapter.



CHAPTER VI


MULTICRITERION OPTIMAL CONTROL UNDER UNCERTAINTY

6.1 INTRODUCTION 


In the earlier chapters, we studied N-person 
differential games and multicriterion optimal control 
problems under a deterministic framework. We relax this 
restriction in this chapter on the lines indicated in 
Chapter II. This enables us to consider some of the 
current problems of Interest in Optimal Control Theory 
in the light of the results obtained so far in this thesis.


The general formulation in Chapter II gives rise
to stochastic Differential Games with imperfect and
incomplete information. A general class of stochastic
differential games with perfect information is studied
by Kushner and Chamberlain (1969). Markov Positional
Games, studied by Sarma et al. (1969) and Ragade (1968), are
stochastic differential games with imperfect information.
Finite games with incomplete information appeared recently
(Harsanyi 1968).


In Section 6.2, we present the signal design
problem for system identification. It will be shown that
this problem can be solved as a stochastic optimal control
problem. This is followed by a discussion on Multicriterion
Stochastic Optimal Control. An example of a
linear system with two inputs and two quadratic criteria
is worked out. Extensions to Adaptive Control, the
problem of control under uncertainty, are also indicated
in Section 6.3.


The main result in Chapter V, that multicriterion
optimal control problems can be solved as N-person
differential games without sidepayments but with the added
restriction that all the players have equal information,
still holds under the stochastic formulation. Thus
multicriterion stochastic optimal control problems are
easier to solve compared to the general stochastic
differential games in which the information to the various
players is different. The results presented in this
chapter are mainly exploratory in nature.

6.2 SIGNAL DESIGN FOR SYSTEM IDENTIFICATION



Here we consider the signal design problem for
the identification of a system. This problem arose
originally in the context of controlling an uncertain
plant. The uncertainty might be in terms of some
parameters in the plant dynamics or its impulse response in
the linear case. In the earlier literature (for a surv ...
... were suggested with schemes to continuously monitor the
controller parameters depending upon the identification of
the system, which is understood in the sense of the unknown
parameter estimation. In this connection, there have
been attempts at designing separate input signals for
the purpose of identification. Such a problem is
described below in the time domain.


We consider a dynamic system whose state x g 
and measurable 1 outputs y, of dimensions n g and m 
respectively, satisfy a known form of dynamic equations 

*s = f s<V V u > W S’ (6,1 

y = b s (x s , W 2 , t) (6.2 


where u denotes the r - dimensional vector input to the 
system, w g and w 2 are uncorrelated white gaussian 
noises of dimensions p^ and with means zero and 

covariances © g and © 2 respectively. The set of 
unknown parameters, which cannot be observed directly, is 
represented by x^. These parameters can as well be in 
(6.2). Depending upon the statistical characteristics and 
any markov property satisfied by them, they can be written 



' / . 


1 The outputs are 

sensors. 
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By appending x.^ with x g to form a single vector 
one can write (6.1) - (6.3) as

$$\dot{x} = f(x, u, w_1, t) \tag{6.4}$$

$$y = h(x, w_2, t) \tag{6.5}$$

Footnote 2 is valid for (6.4) and (6.5) as well.


For any $u$, the estimated state trajectory $\hat{x}$ of
the system (6.4) and (6.5) includes the parameter
trajectory. If there is no other control task, the freedom
in the choice of $u$ can be exercised in minimizing the
criterion functional which depends upon the errors in
estimation and the cost of the inputs over the fixed time
interval $[t_0, t_f]$ of interest


t 0 ] = B* [ f I x-x|| + 1 

t„ t Q(t) 


+ 1*11 \dt 
R(t)- > 


where 


£(t) * B[x(t) 1 y*l 


( 6 . 6 ) 


(6.7) 


In (6*6) j y* represents the cumulated observations upto 


time t, 


^ r < * $ 

Equation (6.5) can also he written as 


( 6 . 8 ) 


j£u,to3 %to ^ 


I y 0 ] 




(6.9) 


where y ° 


. . j — 
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in [t 0 ,t f ], 


J [u] 


S 

x 


^11 x-xil 


+ IN } dt 

Q(t) R(tK 


( 6 . 10 ) 


In (6.6) and (6.10), Q(t) is chosen to suitably weigh 
only the unknown parameters in the total state. 


This problem (6.1) - (6.10) is termed the signal
design problem for system identification. Thus the
formulation is similar to the stochastic optimal control problem.
By the principle of optimality it follows that $u^o_{[\tau, t_f]}$,
a segment of the optimal control $u^o$ for any $\tau$ in $[t_0, t_f]$,
should minimize

J[u,r] * Byt. \ [ I y 1 ( 6 . 11 ) 

The solution of the problem $u^o(y^t, t)$ or alternatively
$u^o(p(\cdot \mid y^t), t)$ is given by solving Bellman's Value equation
and a modified Chapman-Kolmogorov equation for $p(\cdot \mid y^t)$
along with the system equations. By averaging the
expression for the conditional density $p(\cdot \mid y^t)$, we get $\hat{x}$.
We consider below a simple example, which envelopes the
problem studied by Levadi (1966) using the reproducing
kernel Hilbert space.
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where the state x and the observations y are of 
dimensions $n$ and $m$ respectively, $w_1$ and $w_2$ are
uncorrelated white noises of dimensions $p_1$ and $p_2$ with
mean zero and of covariances $\Theta_1$ and $\Theta_2$ respectively.
The matrices $A$, $C$, $D_1$ and $D_2$ are of proper dimensions.
We obtain the input $u$ as a program (or open loop control
law) independent of $x$ or $\hat{x}$, so as to minimize the
performance index

$$J[u] = E\left\{ \int_{t_0}^{t_f} \left( \|x - \hat{x}\|^2_{Q(t)} + \|u\|^2_{R(t)} \right) dt \right\} \tag{6.13}$$


Since the system equations are linear in x and 
the noises are white gaussian, the optimal filter which 
gives x, is of the Kalman type, 


5k « A(u,t) x + PG JC © 2 1 (y-Cx) (6.14) 3 

where $P = \operatorname{cov}(x - \hat{x})$ is the covariance of the error, which
satisfies the Riccati equation


P = A r (u,t) P + PA(u, t) - PC^Dg©" 1 DgGP + 1^©^ 

( 6 . 15 ) 



with the initial condition $P(t_0) = \operatorname{cov}(x(0))$.

For minimizing (6.16) subject to (6.15), we can apply the
Matrix Minimum Principle (Athans 1966), i.e., we have


H(p,cf,u,t) = < P,Q > + < U,RU>+ <i:,P> 

4= -|f 5 *<V“ 0 


and 

if = H(P,^, u°,H - min H(P,^,u,t) (6, 19) 

Thus $u^o$ is obtained by minimizing the Hamiltonian (6.17).
A similar procedure is used by Athans and Schweppe (1967)
to solve the problem of synthesizing an optimum modulation
signal.
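The constraint in this minimization is a Riccati differential equation for the error covariance, and its behaviour is easy to examine in the scalar case. The sketch below assumes the standard scalar form of the Kalman filter covariance equation for a plant with constant coefficient a and observation gain c, with q and r standing for the effective process and measurement noise intensities. These names and values are assumptions for illustration, and the dependence of the plant matrix on the input u, which is what makes signal design possible, is left out.

```python
import math


def riccati_rhs(P, a, c, q, r):
    """Scalar error covariance rate: dP/dt = 2*a*P - (c*P)**2 / r + q."""
    return 2.0 * a * P - (c * P) ** 2 / r + q


def integrate(P0, a, c, q, r, dt=1e-3, t_end=20.0):
    """Forward Euler integration of the scalar Riccati equation."""
    P = P0
    for _ in range(int(round(t_end / dt))):
        P += dt * riccati_rhs(P, a, c, q, r)
    return P


def steady_state(a, c, q, r):
    """Positive root of (c**2 / r) * P**2 - 2*a*P - q = 0."""
    g = c * c / r
    return (a + math.sqrt(a * a + g * q)) / g


print(integrate(1.0, a=-1.0, c=1.0, q=1.0, r=1.0), steady_state(-1.0, 1.0, 1.0, 1.0))
```

In the signal design problem the input enters through the coefficients of this equation, so a change of input changes the whole covariance trajectory that is being penalized.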


For the more general system in (6.4) and (6.5),
the filter equations can be assumed to be the suboptimal
filter given by Schwartz (1966). For the suboptimal
Schwartz filter, the equation for the error covariance $P$
is an ordinary differential equation. As such, the matrix
minimum principle can similarly be applied to determine
the optimal input $u^o$. This problem may be termed a
signal design problem for a system with a constrained
state estimator.

6.3 MULTICRITERION OPTIMAL CONTROL WITH UNCERTAINTY

In this section we discuss the optimal control
of an uncertain plant
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The control u Is to be chosen as a function of the 
cumulated observations $y^t$, to minimize, for $p = 1, \dots, N$,

$$J^p[u, t_0] = E\left[ \phi^p(x_f, t_f) + \int_{t_0}^{t_f} L^p(x, u, t)\,dt \right] \tag{6.20}$$

Multiple criteria arise in many situations, for
example when one is interested simultaneously in the
identification and in the control of the system.


Stochastic Version:

If the statistical characterization of the unknown
parameters in (6.1) is complete in the problem, then the plant
can be represented by (6.4) and (6.5) with the noise terms
appearing linearly. Because of the presence of the noise
terms, the resulting problem is termed stochastic. The
information is imperfect because the observations cannot
yield the exact state vector.


Even though each criterion in (6.20) will be able
to introduce a total ordering on the set of control policies,
the criterion functional vector will not be able to do so.
This is similar to the deterministic case and the solution
is given in terms of the noninferior control laws. Once
again a noninferior control law $u^o$ on the interval $[t_0, t_f]$
is defined as in (2.24) with respect to $J^p[u, t_0]$ in (6.20).
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By the principle of optimality, u° for any 

$t$ in $[t_0, t_f]$ is also noninferior on the subinterval
$[t, t_f]$. Under suitable convexity conditions in the
cost-vector space for the feasible controls, it may be possible
to scalarize the vector criterion. We assume such a
scalarization in the example considered below.
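What scalarization does under convexity can be seen on a static toy problem with one decision variable and two quadratic criteria. The sketch is only illustrative and the criteria are hypothetical: $J^1 = (u-1)^2$ and $J^2 = (u+1)^2$, whose weighted sum $k'J^1 + (1-k')J^2$ is minimized at $u = 2k' - 1$.

```python
def noninferior_point(k):
    """Minimizer of k*(u - 1)**2 + (1 - k)*(u + 1)**2 and its two costs."""
    u = 2.0 * k - 1.0            # from setting the derivative to zero
    return u, (u - 1.0) ** 2, (u + 1.0) ** 2


def sweep(n=5):
    """Trace the noninferior set by sweeping the weight k over [0, 1]."""
    return [noninferior_point(i / (n - 1)) for i in range(n)]


for u, j1, j2 in sweep():
    print(u, j1, j2)
```

Along the sweep one criterion improves only at the expense of the other, so no point in the list is inferior to another, and a Supercriterion is still needed to single out one of them.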


The Supercriterion, for selecting one noninferior
control as the solution of the problem, depends upon the
noncooperative solution of the game. We present below an
example to illustrate the ideas involved.


Example 3.2: We consider a linear system with two
quadratic criteria and with noisy observations.

$$\dot{x} = A(t)\,x + B^1(t)\,u^1 + B^2(t)\,u^2 + D_1(t)\,w_1, \qquad y = C(t)\,x + D_2(t)\,w_2 \tag{6.21}$$

where the state $x$, observations $y$ and the control
variables $u^1$ and $u^2$ are of dimensions $n$, $m$, $r^1$ and $r^2$
respectively. In (6.21), $w_1$ and $w_2$ are uncorrelated
zero-mean white noises of dimensions $p_1$ and $p_2$ and with
covariances $\Theta_1$ and $\Theta_2$. The matrices $A$, $B^1$, $B^2$, $C$,
$D_1$ and $D_2$ are of proper dimensions.

The objective is to minimize the performance
indices given for $p = 1, 2$ as
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where 





The problem formulated is similar to the two-person game
formulated by Rhodes (1969) in which he assumes D = Q = 0.
Rhodes also assumes two different observation equations for
the players (see also Rhodes and Luenberger 1969 and Behn
and Ho 1968 for the two-person zero-sum version).


The Nash equilibrium solution is obtained by
Rhodes when one of the players has either null or perfect
observations. We present the solution to the Bicriterion
Control Problem where, by assumption, both the players have
imperfect but equal observations. Thus for $p = 1, 2$
we have


$$u^{p*}(t) = -\left(R^p\right)^{-1}(t)\,\left(B^p\right)^T(t)\,S^p(t)\,\hat{x}(t) \tag{6.24}$$



where 

S p * - S P A - A p - cf - if If 



( 6 . 

(6.27) 


and 

P - A + PA - PC^e^CP + D x e dJ 
P(t 0 ) * cov (x Q ) 

The proof of this result can be given by showing
the optimality of each player's control law in his
one-sided optimal control problem. We follow the results of
Rhodes (1969) and Rhodes and Luenberger (1969), making use
of the Optimal Return functions for the players. Thus
we have, with $S^p(t_f) = F^p$ and $b^p(t_f) = 0$,

l^(x,t) « x T S p (t) x + b P (t) (6.28) 

The optimal control u p (t) is obtained as the argument 
which minimise « 

[(^^i)!/] (6.29) 

Hence for u^Ct), we get 

ul*(t) * arg min S DJx^S 3 * 4* b 1 -f Rs-^xC Ax + B^u 1 

- B 2 Bf" 3 B 82? S 2 ^ 2 ) + llx|f. 
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The minimum value of (6.29) equated to zero gives the 
equations satisfied by s 1 and b 1 as 


S 1 = “A^S 1 ** S 1 A + sWr^VV + aVlfW 

m 

+ 1 B 2l S 1 - Q 1 


_ . (6.31) 

b 1 = -Tr[p(8S 1 B 1 Rl' 1 B ir s 1 + S^lf'W + ' 

O Jfc 

The second player's control $u^{2*}(t)$ is similarly proved
to be optimal.




The control law (6.24) is intuitively obvious. 
Along with the control law for the deterministic problem 
with perfect observations (3.48), it follows that the 
Separation Theorem (Wonham 1968) is valid for both the
players. 

The Pareto optimal control laws are obtained by
considering the scalarized stochastic optimal control
problem given by (6.21) with the performance index

$$J[u, t_0, k'] = k'\,J^1[u, t_0] + (1 - k')\,J^2[u, t_0] \tag{6.32}$$

Obviously the Separation Theorem is valid for this class
of problems and the control law is given by



where 

SU f > = - A r SU') - S(k f )A - Q(tf) + 


+ SCk*)B?R 8 " 1 aLt) B 8l sCtl ) 
S(t f ,|c l ) * fc'F 1 + (l-lct)f^ 


(6.34) 


QC^ 1 ) = K'Q 1 + ( 1- 1* ) Q 2 (6.35) 

and for p = 1,2 


R P (\c t ) = ^'R 1 + ( 1-V ) R^ (6.36) 

jP * 

This can be proved, as in the case of the Nash equilibrium
solution, by assuming a quadratic form for the Optimal
Return Function $V(x, t, k')$.


By assuming the Nash equilibrium solution (6.24)
as the noncooperative solution, the optimal $k'^o$
corresponding to the Nash cooperative solution is obtained by
minimizing the Nash Supercriterion $I$ with respect to
$k'$, where

$$I(y^{t_0}, t_0, k') = -\left[W^1(y^{t_0}, t_0) - V^1(y^{t_0}, t_0, k')\right]\left[W^2(y^{t_0}, t_0) - V^2(y^{t_0}, t_0, k')\right] \tag{6.37}$$

Next we consider the case when the unknown parameters
are not statistically characterized completely.

Adaptive Version:


control und< 
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the problem Into Identification and Control i s considered 
essential and the notion of Adaptation is associated with
the two attributes of Identification and Learning. However,
this splitting of the problem into two levels, similar to
the multilevel techniques (Mesarovic 196 ), is subjective.
Further, some of the assumptions in the earlier literature
are difficult to justify.


In the stochastic version, where the unknown
parameters in the system are statistically characterized
completely, we saw that the controller requires in general
the current probability density of the state vector
(including the parameters) and this can be viewed as a
Learning Process. In contrast to this situation, if the
statistical characterization of the unknown parameters in
(6.1) is incomplete, then even a single criterion will be
unable to introduce a total ordering on the set of control
policies. The resulting partial ordering is due to the
presence of the uncertainty. Only in this case is one
justified in calling the problem Adaptive, and a
Supercriterion is essential to resolve the dilemma
(Sworder 1966). A possible Supercriterion is to complete
the statistical characterization of the unknown parameters
in a worst sense and design a minimax controller (also
see Ragade and Sarma 1967 for the deterministic case). This
... (Sworder 1966) and exhibits learning as explained earlier
in the case of stochastic optimal control. Thus the
application of a Supercriterion for the solution is what
distinguishes an Adaptive Control Problem.


In the presence of more than one criterion, the
Supercriterion applied for an Adaptive Control Problem
should thus reflect the resolution of the dilemmas caused
by the presence of both the uncertainty and the multiple
criteria.


The worst-case type of Supercriterion can be
obtained for this problem from the Theory of Approachability
and Excludability of Blackwell (1956) for finite two-person
zero-sum games with the payoffs being vector-valued.
Harsanyi (1968) initiated the solution of finite games with
incomplete information. These results will be of considerable
significance in arriving at a solution to the above problem.
This is suggested for further research.


6.4 CONCLUSIONS 


The Theory of Stochastic Differential Games for
the cases of imperfect and incomplete information to the
players is relatively new, with very few significant results,
and holds promise as a potential area for Research. The
problems of Multicriterion Optimal Control of an uncertain
plant, both the Stochastic and the Adaptive versions, are
simpler problems of the above theory with all the players
having equal information. These problems can be solved
making use of the concepts of Blackwell (1956) and
Harsanyi (1968).







CHAPTER VII
CONCLUSIONS


N-Person Differential Games are a class of
Infinite Games with a continuum of strategies and a
continuum of moves to the players. They can also be
considered as multisided generalizations of Optimal
Control Problems. The solution concepts of Finite
Games are generally applicable to N-Person Differential
Games. Thus the solution depends upon the Information
Patterns to the players and the level of Cooperation
between the players. Differential Games find application
in Economics, Warfare and System Design.

Since any practical system design usually 
requires the satisfaction of several basically different 
objectives, the study of Multicriterion Optimal Control 
Problems assumes great importance. It is shown that they 
can be solved as Cooperative N-Person Differential Games 
without sidepayments and with equal information to all 
the players. The solution obtained in this way reflects 
the Tradeoff Factors between the various criteria in a 
game-theoretic sense. It is the equal information 
feature which makes the solution of these problems 
comparatively easier 
In this Thesis, a study of N-Person Differential 
Games is undertaken mainly in a deterministic framework. 
Their solution shows many similarities with the solutions 
of Finite Games on the one hand and of Optimal Control 
Problems on the other. These include the existence of 
multiple Equilibrium Solutions (including an uncountable 
number of them) as in Finite Games and the application of 
a Minimum Principle as in Optimal Control. However, they 
also exhibit features unknown in these areas. For example, 
unlike in optimal control, they show a wide variety of 
switching surfaces (studied recently in the case of 
two-person zero-sum games). Though the study of these 
surfaces at this stage is mainly motivated through 
examples, a thorough understanding of these surfaces is 
an essential prerequisite to a comprehensive theory of 
Differential Games. 



Many types of Information Patterns to the players 
are possible in Differential Games. In this Thesis, the 
two extremes of null and perfect information are considered. 
Consideration of partial information to the players, along 
with the introduction of mixed strategies, will permit a study 
of larger classes of games. 

The study of games with 
information to the player# 1# 

Multi criterion Optimal Contr 



A thorough study of these problems requires extra 
mathematical concepts such as the recently developed 
Calculus of Ito. Such concepts are presented for 
Optimal Control and Two-Person Zero-Sum Games in a 
recent Thesis by Mandke (1969). 

Lastly, the numerical solution of Differential 
Games brings out certain problems peculiar to them. Some 
of these problems are suggested for further investigation 
and are continuing to receive attention. 



APPENDIX A 


SWITCHING CURVE REACHING THE ORIGIN 

For the Bicriterion Differential Game discussed 
in Chapters III, IV and V, we derive here the equation for 
the switching curve along which the initial state $(s_1, s_2)$ 
reaches the origin under the control law $(u^1, u^2)$. 

The game satisfies the state equations 

\[
\dot{x}_1 = x_2, \qquad \dot{x}_2 = u^1 + c\,u^2 \tag{A.1}
\]


The Payoff Functionals $J^1$ and $J^2$ for the players are 
defined as 

\[
J^1[s,u] = \int_0^{t_f} dt, \qquad
J^2[s,u] = \int_0^{t_f} \bigl[\,|u^1| + b\,|u^2|\,\bigr]\,dt \tag{A.2}
\]


On integrating (A.1) with the initial condition 
$x_1(0) = s_1$, $x_2(0) = s_2$ we have 

\[
x_2 = s_2 + (u^1 + c\,u^2)\,t, \qquad
x_1 = s_1 + s_2\,t + \tfrac{1}{2}(u^1 + c\,u^2)\,t^2 \tag{A.3}
\]


For the system state to reach the origin, there must exist 
some $t_f > 0$ with $x_1(t_f) = 0$, $x_2(t_f) = 0$ in (A.3). 

Hence we get 

\[
t_f = -\frac{s_2}{u^1 + c\,u^2}, \qquad
\operatorname{sgn} s_2 = -\operatorname{sgn}(u^1 + c\,u^2) \tag{A.4}
\]

and 

\[
s_1 = \frac{s_2^2}{2(u^1 + c\,u^2)}
\]
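
The step from (A.3) to the expression for $s_1$ can be spelled out as follows. This is only a restatement of the substitution. The shorthand $a = u^1 + c\,u^2$ is introduced here for brevity and is not used in the text; the controls are assumed constant along the arc, with $a \neq 0$.

```latex
\begin{align*}
x_2(t_f) = 0 &\;\Rightarrow\; s_2 + a\,t_f = 0
             \;\Rightarrow\; t_f = -\frac{s_2}{a},\\
x_1(t_f) = 0 &\;\Rightarrow\; s_1 + s_2 t_f + \tfrac12 a\,t_f^2
             = s_1 - \frac{s_2^2}{a} + \frac{s_2^2}{2a} = 0
             \;\Rightarrow\; s_1 = \frac{s_2^2}{2a}.
\end{align*}
```

The requirement $t_f > 0$ is what forces $\operatorname{sgn} s_2 = -\operatorname{sgn} a$ in (A.4).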

We now define the switching curve $\gamma^{0}_{u^1 u^2}$ as 
follows: 

\[
\gamma^{0}_{u^1 u^2} = \Bigl\{ (s_1, s_2) :
\operatorname{sgn} s_2 = -\operatorname{sgn}(u^1 + c\,u^2),\;
s_1 = \frac{s_2^2}{2(u^1 + c\,u^2)} \Bigr\} \tag{A.5}
\]


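The values of the payoff functionals (A.2) are therefore 
given by 

\[
\begin{aligned}
J^1[s,u] &= t_f = -\frac{s_2}{u^1 + c\,u^2},\\
J^2[s,u] &= \int_0^{t_f} \bigl[\,|u^1| + b\,|u^2|\,\bigr]\,dt
          = \bigl[\,|u^1| + b\,|u^2|\,\bigr]\,t_f
          = -\frac{\bigl(|u^1| + b\,|u^2|\bigr)\,s_2}{u^1 + c\,u^2}
\end{aligned} \tag{A.6}
\]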
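
The relations (A.3) to (A.6) can be checked numerically. The following is a minimal sketch, assuming constant controls on the terminal arc; the function names `terminal_arc` and `state_at` are illustrative and do not come from the Thesis.

```python
def terminal_arc(s2, u1, u2, b, c):
    """Return (t_f, s1, J1, J2) for a constant-control arc ending at the origin.

    Follows (A.4) to (A.6); s1 is the position that puts (s1, s2) on the
    switching curve for the given controls.
    """
    a = u1 + c * u2                      # net acceleration, shorthand only
    if a == 0 or s2 * a >= 0:
        # (A.4) needs sgn(s2) = -sgn(u1 + c*u2), otherwise t_f is not positive
        raise ValueError("need sgn(s2) = -sgn(u1 + c*u2)")
    t_f = -s2 / a                        # time to reach the origin
    s1 = s2 ** 2 / (2 * a)               # position on the switching curve
    J1 = t_f                             # time payoff
    J2 = (abs(u1) + b * abs(u2)) * t_f   # weighted control-effort payoff
    return t_f, s1, J1, J2


def state_at(t, s1, s2, u1, u2, c):
    """State (x1, x2) at time t from (A.3), starting at (s1, s2)."""
    a = u1 + c * u2
    return s1 + s2 * t + 0.5 * a * t * t, s2 + a * t
```

For example, with `s2 = 2`, `u1 = u2 = -1`, `b = 0.5` and `c = 1`, the arc takes one time unit and `state_at` evaluated at that time returns the origin.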


After obtaining the terminal sequence and the 
associated equations (A.5) and (A.6), we proceed to 
construct the noncooperative or cooperative solution of 
the game by satisfying the required corner conditions. 



APPENDIX B 

EVALUATION OF VALUE FUNCTIONS 

For the Bicriterion Differential Game discussed 
in Chapters III, IV and V, we give here a general result 
for the evaluation of the Value Functions. 


The game satisfies the state equations 

\[
\dot{x}_1 = x_2, \qquad \dot{x}_2 = u^1 + c\,u^2 \tag{B.1}
\]

The Payoff Functionals for the players are given by 

\[
J^1[x_0,u] = \int_0^{t_f} dt, \qquad
J^2[x_0,u] = \int_0^{t_f} \bigl[\,|u^1| + b\,|u^2|\,\bigr]\,dt \tag{B.2}
\]


Suppose the equation of the curve $\Gamma$ in 
Figure B.1 is given by 

\[
\Gamma = \bigl\{ (s_1, s_2) : s_1 = -\tfrac{\alpha}{2}\,s_2^2 \bigr\} \tag{B.3}
\]


and the Value Functions on $\Gamma$ are assumed as 

\[
V^1(s_1, s_2) = \beta^1\,|s_2|, \qquad
V^2(s_1, s_2) = \beta^2\,|s_2| \tag{B.4}
\]


We find the Value Functions for points in the 
region below $\Gamma$ where the optimal control is $(u^1, u^2)$. 






A typical optimal trajectory starting from $(r_1, r_2)$ and 
reaching $\Gamma$ in $(s_1, s_2)$ is shown in Figure B.1 and let 
$t$ be the time taken. Then solving (B.1) we have 

\[
s_2 = r_2 + (u^1 + c\,u^2)\,t, \qquad
s_1 = r_1 + r_2\,t + \tfrac{1}{2}(u^1 + c\,u^2)\,t^2
\]

Since $(s_1, s_2)$ lies on $\Gamma$, we get 

\[
r_1 + r_2\,t + \tfrac{1}{2}(u^1 + c\,u^2)\,t^2
= -\tfrac{\alpha}{2}\bigl[r_2 + (u^1 + c\,u^2)\,t\bigr]^2 \tag{B.5}
\]

or 

\[
t^2\,\frac{u^1 + c\,u^2}{2}\bigl[1 + \alpha(u^1 + c\,u^2)\bigr]
+ t\,r_2\bigl[1 + \alpha(u^1 + c\,u^2)\bigr]
+ r_1 + \tfrac{\alpha}{2}\,r_2^2 = 0 \tag{B.6}
\]


Solving (B.6) for $t$, we get 

\[
t = -\frac{r_2}{u^1 + c\,u^2}
\pm \frac{1}{u^1 + c\,u^2}
\sqrt{\frac{r_2^2 - 2r_1(u^1 + c\,u^2)}{1 + \alpha(u^1 + c\,u^2)}} \tag{B.7}
\]

where the second value corresponds to the dotted curve 
symmetrical with respect to $\Gamma$. 
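
The simplification that leads from the quadratic (B.6) to (B.7) is not shown in the text. A sketch of it is given below, assuming the curve of (B.3) has the form $s_1 = -\tfrac{\alpha}{2}s_2^2$. The shorthands $a = u^1 + c\,u^2$ and $k = 1 + \alpha a$ are introduced here only for brevity, with $a \neq 0$ and $k > 0$ assumed.

```latex
\begin{align*}
&\tfrac{a}{2}\,k\,t^2 + r_2\,k\,t + r_1 + \tfrac{\alpha}{2}\,r_2^2 = 0,\\[2pt]
&r_2^2k^2 - 2ak\bigl(r_1 + \tfrac{\alpha}{2}r_2^2\bigr)
  = k\bigl[r_2^2(1 + \alpha a) - 2ar_1 - \alpha a\,r_2^2\bigr]
  = k\bigl(r_2^2 - 2ar_1\bigr),\\[2pt]
&t = \frac{-r_2k \pm \sqrt{k\,(r_2^2 - 2ar_1)}}{a\,k}
   = -\frac{r_2}{a} \pm \frac{1}{a}\sqrt{\frac{r_2^2 - 2ar_1}{k}}.
\end{align*}
```

The second line is the discriminant of the quadratic; the terms in $\alpha a\,r_2^2$ cancel, which is why only $r_2^2 - 2ar_1$ survives under the root.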

Substituting (B.7) in (B.5), we have 

\[
s_2 = \pm\sqrt{\frac{r_2^2 - 2r_1(u^1 + c\,u^2)}{1 + \alpha(u^1 + c\,u^2)}} \tag{B.8}
\]

Once again the second value corresponds to the dotted 
curve in Figure B.1. Hence from (B.4) and (B.8) the 
Equation (B.9), giving the Value Functions for 
the trajectories reaching $\Gamma$ and its symmetric counterpart 
in Figure B.1, enables us to determine the Noncooperative 
and Cooperative Value Functions of the game. 
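
The relations (B.5) to (B.8) can be checked numerically. The sketch below assumes the curve has the form $s_1 = -\tfrac{\alpha}{2}s_2^2$ and that the controls are constant along the arc; the function name `hit_curve` is illustrative and not part of the Thesis.

```python
import math


def hit_curve(r1, r2, u1, u2, c, alpha):
    """Both roots of (B.6): list of (t, s1, s2) for the arc from (r1, r2).

    Assumes a = u1 + c*u2 is nonzero, 1 + alpha*a is positive and the
    radicand in (B.7) is non-negative; otherwise Python raises an error.
    """
    a = u1 + c * u2                       # net acceleration, shorthand only
    k = 1 + alpha * a
    root = math.sqrt((r2 ** 2 - 2 * a * r1) / k)
    points = []
    for sign in (+1, -1):
        t = -r2 / a + sign * root / a     # (B.7)
        s2 = r2 + a * t                   # equals sign * root, cf. (B.8)
        s1 = r1 + r2 * t + 0.5 * a * t * t
        points.append((t, s1, s2))
    return points
```

Each returned point satisfies `s1 == -(alpha / 2) * s2**2` up to rounding, which is condition (B.5). A negative `t` belongs to the root that is not reached forward in time.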



APPENDIX C 


EQUILIBRIUM SEQUENCES FOR THE NONCOOPERATIVE SOLUTION 


Here we obtain the Nash equilibrium sequences 
for the Bicriterion Differential Game discussed in 
Chapter IV. The game satisfies the differential equations 

\[
\dot{x}_1 = x_2, \qquad \dot{x}_2 = u^1 + c\,u^2 \tag{C.1}
\]

The Payoff Functionals for the players are given by 

\[
J^1[x_0,u] = \int_0^{t_f} dt, \qquad
J^2[x_0,u] = \int_0^{t_f} \bigl[\,|u^1| + b\,|u^2|\,\bigr]\,dt \tag{C.2}
\]

The application of the transversality 
conditions and the minimum principle was shown in 
Example 3.2. The resulting equations are (3.75), (3.76), 
(3.66) and (3.71). By making use of these equations, we 
obtain the terminal sequences for the various cases 
c > b, c = b and c < b that arise in the problem. 

Thus for example if we assume the terminal 
sequence as 

\[
u^1 = -1; \qquad u^2 = 0 \tag{C.3}
\]

we have from (3.76), 

\[
\lambda_2^1(t_f) = \lambda_2^2(t_f) = 1 \tag{C.4}
\]
From (3.66), (C.4) and (C.3), it is required that 

\[
1 < \frac{b}{c} \quad\text{or}\quad c < b \tag{C.5}
\]


Thus for the sequence $\begin{bmatrix} -1 \\ 0 \end{bmatrix}$ (and similarly for the sequence 
$\begin{bmatrix} +1 \\ 0 \end{bmatrix}$) to be an equilibrium sequence, c < b is necessary. 
Similarly, the terminal sequences I*- 1 ! and f 

buj LieJ 

were shown to be optimal for the cases c > b and c = b 
in Examples 3.2 and 4.1 respectively. Starting from these 
terminal sequences, we should construct the equilibrium 
sequences by satisfying the corner conditions at the 
appropriate switching surfaces. The procedure is similar 
to that shown in Example 3.2 where this was worked out for 
the case c > b. We present below only the remaining cases. 

Case (1) : c < b 

Pbr this case, the sequences I"- 1 + 1 71 1 are 

I ±1 0 0 J 

in equilibrium provided the condition 2o*>b^+2bc+c^ > 6 is 

also satisfied. When this condition is not satisfied, the 

1 +11 

1. We prove this for the 

0 OJ 

sequences given by the upper values. Figure C.1(i) shows 
typical plots of $\lambda_2$, $u^1$ and $u^2$ for this case. The 
corresponding switching curves, shown in Figure 4.3(i), are 
constructed below. The state of the game at $t_2$ and $t_3$ 
is denoted by $(r_1, r_2)$ and $(s_1, s_2)$ respectively. 


equilibrium sequences are j~— 1 4-1 
Applying the corner conditions at the switching 
surface $\gamma^{0}$ corresponding to the terminal sequence, 
we can easily see that the $\lambda$ variables are continuous on 
this switching surface. Thus we have 

Ai ( V } = A^(t 2 +) = X l^ 3 ~ ) = A l (t 3 “ ) *■ " SJ 

CG.6) 

^2^2 + ^ = c 

i . 

Now from (3.71) and (C.6), we have 

\[
t_3 - t_2 = \frac{b}{c}\,s_2 \tag{C.7}
\]


Solving the dynamic equations (3.62) we have 

\[
s_2 = r_2 + \frac{b}{c}\,s_2, \qquad
s_1 = r_1 + r_2\,\frac{b}{c}\,s_2 + \frac{1}{2}\,\frac{b^2}{c^2}\,s_2^2 \tag{C.8}
\]




Eliminating $(s_1, s_2)$, we get the equation of 
the switching curve as 

\[
\Gamma' = \bigl\{ (r_1, r_2) : r_1 = -\tfrac{\alpha'}{2}\,r_2^2\,\operatorname{sgn} r_2 \bigr\}
\]

where 

\[
\alpha' = \frac{b^2 - c^2 - 2bc}{(c - b)^2}
\]
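
The coefficient $\alpha'$ can be cross-checked by tracing one trajectory backwards from the terminal curve. The sketch below rests on assumptions that are a reading of the garbled source rather than statements taken from it: the terminal arc uses $(u^1, u^2) = (-1, 0)$, so that $s_1 = -s_2^2/2$ with $s_2 > 0$ by Appendix A; the arc before it uses $(+1, 0)$ and lasts $(b/c)\,s_2$; and the switching curve has the form $r_1 = -\tfrac{\alpha'}{2}\,r_2^2\,\operatorname{sgn} r_2$. The function names are illustrative.

```python
import math


def gamma_prime_point(s2, b, c):
    """Trace back from the terminal curve to the earlier switch point (r1, r2).

    Assumed arcs (a reading of the source, not confirmed by it):
    terminal arc (u1, u2) = (-1, 0), preceding arc (+1, 0) of duration
    tau = (b / c) * s2, for the case c < b and s2 > 0.
    """
    s1 = -s2 ** 2 / 2                 # point on the terminal curve
    tau = (b / c) * s2                # duration of the preceding arc
    r2 = s2 - tau                     # x2 grows at rate +1 on that arc
    r1 = s1 - r2 * tau - 0.5 * tau ** 2
    return r1, r2


def alpha_prime(b, c):
    """Coefficient of the switching curve as read from the source."""
    return (b ** 2 - c ** 2 - 2 * b * c) / (c - b) ** 2


def on_gamma_prime(r1, r2, b, c, tol=1e-9):
    """True if r1 = -(alpha'/2) * r2**2 * sgn(r2) holds within tol."""
    predicted = -0.5 * alpha_prime(b, c) * r2 ** 2 * math.copysign(1.0, r2)
    return abs(r1 - predicted) < tol
```

With `b = 2`, `c = 1` and `s2 = 1` the traced point is `(-0.5, -1.0)`, and `on_gamma_prime` accepts it, so the two readings agree with each other under the stated assumptions.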

Also from (C.6) and (C.8), we have 

V Ct 2 +) = *i Ct 2*> = 


(C. 9 ) 


(C.10) 


ca* id 


Applying the comer conditions (3.25) on the 
switching curve f 1 , with (r^jtg) 

\ cr Z “*1^2“^} (_0< r 2^ + f" c ~ ^2 (t 2~0 = ° 

L 2 ( C 

1 + A^(tg-)r 2 +xl (t 2 -) (uVcu 2 ) = 0 

(G. 12) 

^ v>} + [- 1 - a5(v0 = 0 

ju 1 ! + h|u 2 l +A^(t 2 -)r 2 + Ai(V) (uVcu 2 ) = o 

Equations (C.12) can be solved consistent with (3.71) and 
(3.66) only under the imposed condition 

\[
2c - b^2 + 2bc + c^2 > 0 \tag{C.13}
\]

which can also be written as 


o( r < 


(C.14) 


Then the result is that is continuous and that 

\[
u^1(t) = u^2(t) = +1 \quad \text{for } t < t_2 \tag{C.15}
\]

When the condition (C.13) is not met, there are no 
equilibrium sequences reaching $\Gamma'$. 

Case (2) : c = b 

r* +1 +1 +1 J .. - 

In this case, the sequences ^ 0 +6 J W 

$0 < e \le 1$ are in equilibrium. We prove this for the 
sequence given by the upper values. Figure C.1(ii) shows 
typical plots of the adjoint and control variables and 
Also from (3.71) and (C.19), we have 




* <£) / (85ft,) - % 


(C.21) 


Solving the system equations (3.62) we get 

\[
s_2 = r_2 + (t_3 - t_2) = r_2 + s_2 \tag{C.22}
\]

Hence the switching curve $\Gamma$ is given by 

\[
\Gamma = \bigl\{ (r_1, r_2) : r_2 = 0 \bigr\} \tag{C.23}
\]

A second application of the corner conditions at this corner 
proves the result. 

Equation (C.18) also yields, under the imposed 

* p 

condition 6 < , the following control law 

c 

\[
u^1(t) = 1; \qquad u^2(t) = -1 \quad \text{for } t < t_3 \tag{C.24}
\]

The corresponding plot of the adjoint and control variables 
is not shown. It may be noted that the control law (C.24) is 
similar to (4.23) and can also be obtained similarly. 


After obtaining these and other sequences 
exhaustively, the noncooperative solution is defined by 
a suitable selection of the sequences. 




APPENDIX D 


PROOF OF THE COOPERATIVE SOLUTION 

For the Bicriterion Differential Game discussed in 
Chapters III, IV and V, we indicate here the proof of the 
Cooperative Solution. The game satisfies the equations 

\[
\dot{x}_1 = x_2, \qquad \dot{x}_2 = u^1 + c\,u^2 \tag{D.1}
\]

with the Payoff Functionals defined as 

\[
J^1[s,u] = \int_0^{t_f} dt, \qquad
J^2[s,u] = \int_0^{t_f} \bigl[\,|u^1| + b\,|u^2|\,\bigr]\,dt \tag{D.2}
\]

The state at the final time $t_f$ is specified as the origin. 

In Section 5.5, for the cooperative solution, we 
considered the scalarized optimal control problems with the 
criterion functionals parametrized by $\mu$ as shown in 
(5.54). The application of the minimum principle to this 
problem yielded (5.55) - (5.57). 

By considering a small ball (3.72) around the origin 
as we did in Example 3.2 and applying the transversality 
conditions (5.16), we get 


[X 1 ( t f ) Xg(tj>)3 


'5 sin@ -6 sin6' 


u^cu 2 


S cos0 


' ' ‘ ' '/•;*••• . . - ■ - 

H. r- 

j = L~ < K 


i -^2 


) |c •+ (i-'fc Klu^l+hju 


6] (D.3) 



xn 


Solving (D.3) and letting $\delta \to 0$, we have $\lambda_1(t_f)$ as 
arbitrary and 

\[
\lambda_2(t_f) = -\frac{\mu + (1 - \mu)\bigl(|u^1| + b\,|u^2|\bigr)}{u^1 + c\,u^2} \tag{D.4}
\]


Now we obtain the optimal terminal sequences. 
For $\begin{bmatrix} \pm 1 \\ \pm 1 \end{bmatrix}$ to be the terminal sequence, we should 
have from (5.57) 

\[
\frac{\mu + (1 - \mu)(1 + b)}{1 + c} > \frac{(1 - \mu)\,b}{c}, \qquad
\mu + (1 - \mu)(1 + b) > (1 - \mu)(1 + c) \tag{D.5}
\]

Simplifying (D.5), we have either of the following : 

(i)   $c > b$ and $\mu > \dfrac{c - b}{1 - b + c}$ 

(ii)  $c < b$ and $\mu > \dfrac{b - c}{b}$ 

(iii) $c = b$ 

(D.6) 
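
The passage from (D.5) to (D.6) can be checked numerically. The sketch below takes the two inequalities of (D.5) as they are read here, namely that the terminal value of $\lambda_2$ from (D.4) exceeds both $(1-\mu)b/c$ and $(1-\mu)$; the symbol $\mu$ for the scalarizing parameter, the restriction $0 < \mu < 1$, $b > 0$, $c > 0$, and all function names are assumptions made for the illustration.

```python
import random


def d5_holds(mu, b, c):
    """Both inequalities of (D.5), as read here, for the full-effort terminal sequence."""
    lam2 = (mu + (1 - mu) * (1 + b)) / (1 + c)   # |lambda_2(t_f)| from (D.4)
    return lam2 > (1 - mu) * b / c and lam2 > (1 - mu)


def d6_holds(mu, b, c):
    """Simplified conditions (D.6), for 0 < mu < 1."""
    if c > b:
        return mu > (c - b) / (1 - b + c)        # case (i)
    if c < b:
        return mu > (b - c) / b                  # case (ii)
    return mu > 0                                # case (iii): always true here


def check_equivalence(samples=10000, seed=0):
    """Compare d5_holds and d6_holds on random (mu, b, c) with a fixed seed."""
    rng = random.Random(seed)
    for _ in range(samples):
        mu = rng.uniform(0.01, 0.99)
        b = rng.uniform(0.1, 5.0)
        c = rng.uniform(0.1, 5.0)
        if d5_holds(mu, b, c) != d6_holds(mu, b, c):
            return False
    return True
```

For `c > b` the first inequality of (D.5) holds automatically and the second gives case (i); for `c < b` the roles are reversed and the first gives case (ii).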


Similarly the following terminal sequences are optimal for 
the cases noted against them. 



c < b and |c 
Now the various optimal sequences and the switching 
surfaces can be constructed as shown in Example 3.2. The 
procedure is even simpler in the present case since the adjoint 
variables are continuous. A typical plot of the adjoint 
variable $\lambda_2$ for the case c > b and is shown 
in Figure D.1. Let us denote the state at $t_3$ and $t_4$ as 
$(r_1, r_2)$ and $(s_1, s_2)$ respectively. The corresponding 
switching surfaces, whose construction is indicated below, are 
shown in Figure 5.1(i). 


The equation of the switching curve along which the 
state reaches the origin corresponding to Figure D. 1 is 
obviously "/ Tj- as defined in (3.83). Also, from Figure D. 1, 
we have 

- 7 (1-fc') 


* 4**3 * 


tC-l-Cl-tf) (l+b) 


1+c 


But from (A. 6), (B.7) and (B.8), we have 


(D.8) 


4. t _ s 2 


1 4 -t = - 3 + i. 

4 w 3 -c -c 


S 2 = + 


v 


r| + 2r x c 


1- 


1+c 


r 2 * 2r i C 


o- 


1+c 


CD.9) 




If the switching curve p * on which (r 1 ,r ) lie 

■ . . . ^ ■ ■ 


r 0 =[< 


Cr l> r 2> ! r l = - 


sgn r 2 ^ (D.10) 


2 
*2 


The value of $\alpha$ is obtained by eliminating $(r_1, r_2)$, 
$(s_1, s_2)$, $t_3$, $t_4$ and $t_f$ in (D.8) - (D.10). Thus we have 


fc* 2 c 4 ale (I -fc*) (c-b) - (1-fc:) 2 (b-c) 2 (IX11 ) 

t 2 c(l+c) 

By a similar technique the other switching curves 


given by 

^1 *£ ( x 1 , x 2 ) s 

r 2 : 



(D. 12) 


are constructed. The complete cooperative solution of the 
problem is obtained by repeating this procedure for the 
other cases. 
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